Abstract: Motivated by the demand for energy-efficient communication
solutions in the next generation cellular network,
a mixed-ADC architecture for massive multiple input multiple 
output (MIMO) systems is proposed, which differs from previous 
works in that herein one-bit analog-to-digital converters (ADCs) 
partially replace the conventionally assumed high-resolution 
ADCs. The information-theoretic tool of generalized mutual 
information (GMI) is exploited to analyze the achievable data 
rates of the proposed system architecture and an array of 
analytical results of engineering interest are obtained. For fixed 
single input multiple output (SIMO) channels, a closed-form 
expression of the GMI is derived, based on which the linear 
combiner is optimized. The analysis is then extended to ergodic 
fading channels, for which tight lower and upper bounds of the 
GMI are obtained. Impacts of dithering and imperfect channel 
state information (CSI) are also investigated, and it is shown that
dithering can remarkably improve the system performance while
imperfect CSI only introduces a marginal rate loss. Finally, the
analytical framework is applied to the multi-user access scenario. 
Numerical results demonstrate that the mixed-ADC architecture 
with a relatively small number of high-resolution ADCs is able to 
achieve a large fraction of the channel capacity of the conventional
architecture, while reducing the energy consumption considerably
even compared with antenna selection, for both single-user and 
multi-user scenarios. 

Index Terms: Analog-to-digital converter, dithering, energy efficiency,
generalized mutual information, massive MIMO, mixed-ADC
architecture, multi-user access.


I. Introduction 

The exponential increase in the demand for mobile data
traffic poses a great challenge to the cellular network. In
recent years, heightened attention has been focused on
massive multiple input multiple output (MIMO) systems, in
which each base station (BS) is equipped with hundreds of
antennas and serves tens of users or more simultaneously [1]-[2].
Because the large number of BS antennas can effectively
average out noise, fading and, to some extent, noncoherent
interference, massive MIMO achieves significant gains in both 
spectral efficiency and radiated energy efficiency, and thus is 
envisioned as a promising key enabler for the next generation 
cellular network [3]-[4]. 

Thus far, most of the literature on massive MIMO assumes
a conventional architecture built on ideal hardware. However,
this assumption is not well justified, since the hardware cost
and circuit power consumption scale linearly with the number
of BS antennas and thus soon become practically unbearable
unless low-cost, energy-efficient hardware is deployed,
which however easily suffers from impairments. Assuming
an additive stochastic impairment model, the authors of [5] 
examined the impact of hardware impairments on both spectral 
efficiency and radiated energy efficiency of massive MIMO. 
The authors of [6] obtained a scaling law that describes how fast
the tolerance level of impairments increases with the number
of BS antennas while reaping much of the performance gain
promised by massive MIMO. The authors of [7] examined the
accuracy of widely used additive or multiplicative stochastic
impairment models by providing a hardware-specific deterministic
model and performing comparative numerical studies.

Due to their favorable properties of low cost, low power
consumption and feasibility of implementation [8]-[9],
low-resolution analog-to-digital converters (ADCs) have also attracted
widespread attention in the field of energy-efficient
design for wireless communication systems. For the Nyquist-sampled
real Gaussian channel, the authors of [10] established
some general results regarding low-resolution quantization,
showing that for a quantizer with Q bins, the capacity-achieving
input alphabet should be discrete and need not
have more than Q mass points. The authors of [11] designed
a modified minimum mean square error (MMSE) receiver 
for MIMO systems with output quantization and proposed a 
lower bound to the capacity. In [12], the authors investigated a 
practical monobit digital receiver paradigm for impulse radio 
ultra-wideband (UWB) systems. Recently, the authors of [13] 
examined the impact of one-bit quantization on achievable 
rates of massive MIMO systems with both perfect and estimated
channel state information (CSI). The authors of [14]
addressed the high signal-to-noise ratio (SNR) capacities of 
both single input multiple output (SIMO) and MIMO channels 
with one-bit output quantization. 

Despite its great superiority in deployment cost and energy
efficiency, one-bit quantization generally has to tolerate a large
rate loss, especially in the high SNR regime [14], thus highlighting
the indispensability of high-resolution ADCs for digital
receivers. Besides, the great overhead of pilot-aided channel
estimation under one-bit quantization is also a big concern
[12]-[13], [15]. Motivated by these considerations, in
this paper we propose a mixed-ADC architecture for massive
MIMO systems in which one-bit ADCs partially, but not
completely, replace conventionally assumed high-resolution
ADCs. This architecture has the potential of allowing us to
remarkably reduce the hardware cost and power consumption
while still maintaining a large fraction of the performance gains
promised by the conventional architecture.

For such mixed-ADC massive MIMO, although the channel
capacity is still the maximum mutual information between
the channel input and the quantized channel output vector,
from an engineering perspective the mutual information
maximization problem is not completely
satisfactory in providing engineering insights, because in
this situation the mutual information involves high-dimensional
integration and summation which do not yield closed-form
simplification as in linear Gaussian channels. Generalized
mutual information (GMI) [16]-[17], on the other hand, allows
one to analytically characterize the achievable data rates of
low-complexity linear receivers that are particularly favorable
for massive MIMO systems, and thus we leverage it to
address the performance of the mixed-ADC architecture. As a
performance metric for mismatched decoding, GMI has proved 
convenient and useful in several important scenarios such 
as fading channels with imperfect CSI at the receiver [17], 
channels with transceiver distortion [18]-[19] and analysis of 
bit-interleaved coded modulation [20]. 

Exploiting a general analytical framework developed in 
[18], we obtain a series of analytical results. First, we consider 
a fixed SIMO channel where the BS is equipped with N antennas
but only has access to K pairs^1 of high-resolution ADCs
and (N - K) pairs of one-bit ADCs, and derive a closed-form
expression of the GMI. This enables us to optimize the
linear combiner and further explore the asymptotic behaviors 
of the GMI in both low and high SNR regimes that in turn 
suggest a plausible ADC switch scheme. Besides, the benefit 
of dithering is also investigated, for which we propose a simple 
but effective dithering scheme, which achieves remarkable rate 
gain, especially for the case of small K. 

The analysis is then extended to the scenario of ergodic 
fading channels where, instead of directly working with the 
exact GMI, we derive lower and upper bounds of the GMI, 
which are shown to be very tight by numerical study. Moreover,
numerical results reveal that the mixed-ADC architecture
with a small number of high-resolution ADCs suffices to attain
a large portion of the channel capacity of the conventional
architecture and meanwhile outperforms antenna selection with
the same number of high-resolution ADCs^2. The robustness
of the mixed-ADC architecture against imperfect CSI is also 
investigated. In this paper, we only utilize the high-resolution 
ADCs to perform channel estimation, and thus the resulting
estimation error is Gaussian distributed in Rayleigh fading
channels, allowing us to analytically characterize the resulting 
GMI as well as its lower and upper bounds. Numerical results 
show that the lower and upper bounds are again very tight and 
that there is only a marginal rate loss due to imperfect CSI. 

Finally, we apply our analysis to the multi-user access 
scenario. The corresponding numerical results indicate that 
when equipped with a small number of high-resolution ADCs, 
the mixed-ADC architecture also achieves a large fraction 
of the achievable rate of the conventional architecture and again
outperforms antenna selection with the same number of
high-resolution ADCs.

^1 A pair of ADCs quantizes the I and Q components of an antenna, respectively.

^2 In the conventional architecture, each BS antenna is followed by a radio
frequency (RF) chain built on ideal hardware. Meanwhile, by antenna selection
we mean that there are only K ideal RF chains available at the BS.


In addition, energy efficiencies of the mixed-ADC architecture
and of antenna selection are compared, taking that
of the conventional architecture as a baseline. Numerical results
reveal that under the same spectral efficiency loss, both the
mixed-ADC architecture and antenna selection achieve significant
energy reduction. Moreover, the mixed-ADC architecture
always outperforms antenna selection, especially in the multi-user
scenario. In summary, the mixed-ADC architecture strikes
an attractive balance between spectral efficiency and energy 
efficiency, for both single-user and multi-user scenarios. 

The remaining part of this paper is organized as follows. 
Section II outlines the system model. Adopting GMI as the
performance metric, Section III establishes the theoretical
framework for fixed SIMO channels, based on which the
optimal linear combiner and the asymptotic behaviors of the
GMI in both low and high SNR regimes are explored. Besides,
performance improvement through dithering is also investigated.
Then, Section IV extends the theoretical framework
to ergodic fading channels and evaluates the effects of
imperfect CSI on the system performance. Section V applies 
the theoretical framework to the multi-user access scenario. 
Furthermore, energy efficiency of the mixed-ADC architecture 
is assessed in Section VI. Numerical results are presented 
in Section VII to corroborate the analysis. Finally, Section 
VIII concludes the paper. Auxiliary technical derivations are 
archived in the appendix. 

Notation: Throughout this paper, vectors and matrices are
given in bold typeface, e.g., $\mathbf{x}$ and $\mathbf{X}$, respectively, while
scalars are given in regular typeface, e.g., $x$. We use $\|\mathbf{x}\|_1$
and $\|\mathbf{x}\|$ to represent the 1-norm and 2-norm of vector $\mathbf{x}$,
respectively, and let $\mathbf{X}^*$, $\mathbf{X}^T$ and $\mathbf{X}^H$ denote the conjugate,
transpose and conjugate transpose of $\mathbf{X}$, respectively. The normal
distribution with mean $\mu$ and variance $\sigma^2$ is denoted by
$\mathcal{N}(\mu, \sigma^2)$, while $\mathcal{CN}(\boldsymbol{\mu}, \mathbf{C})$ stands for the distribution of a
circularly symmetric complex Gaussian random vector with
mean $\boldsymbol{\mu}$ and covariance matrix $\mathbf{C}$. Superscripts R and I are
used to indicate the real and imaginary parts of a complex
number, respectively, e.g., $x = x^R + i \cdot x^I$, with $i$ being the
imaginary unit. We use $\mathrm{sgn}(x) = \mathrm{sgn}(x^R) + i \cdot \mathrm{sgn}(x^I)$ to
denote the sign function of a complex number $x$, and $\log(x)$
to denote the natural logarithm of positive real number $x$.

II. System Model

Several scenarios will be addressed in this paper, including 
fixed SIMO channels, ergodic fading SIMO channels with perfect
or imperfect CSI at the receiver, and multi-user channels
with multiple single-antenna users and a multi-antenna BS. In 
this section, we describe the fixed SIMO channel model, and 
the remaining scenarios will be introduced in later sections. 

As aforementioned, we consider a single-user system, where
a single-antenna user communicates with an $N$-antenna BS.
Moreover, we consider a narrow-band channel model^3, for
which the channel vector $\mathbf{h}$ is fixed throughout the transmission
of the codeword and is assumed to be perfectly known by
the BS. Then the received signal at the BS can be expressed
as

$$\mathbf{y}^l = \mathbf{h} x^l + \mathbf{z}^l, \quad \text{for } l = 1, 2, \ldots, L, \tag{1}$$

where $x^l$ is the complex signal transmitted at the $l$-th symbol
time, $\mathbf{z}^l \sim \mathcal{CN}(\mathbf{0}, \sigma^2 \mathbf{I})$ models the independent and identically
distributed (i.i.d.) complex Gaussian noise vector, and $L$ is the
codeword length.

^3 Throughout this paper we focus on a narrow-band channel model, similar
to those considered in, e.g., [5]-[7], [13]-[14], [19], among others. A wideband
channel model includes the multi-path effect, which can still be treated using the
general framework of GMI, and will be treated in a separate work; a further
discussion is in Section VIII.

Fig. 1. Illustration of the system architecture. It is perhaps worth noting that the ADC switch module can also be placed before the RF chains. In this
manner, the RF chain followed by a pair of one-bit ADCs can be manufactured with lower quality requirements and consequently we can further reduce the
power consumption and hardware cost. On the other hand, switching at radio frequency may be more challenging and costly than at baseband. Which choice is
favorable will be determined by practical engineering.

In practice, the received signal at each antenna is quantized
by a pair of ADCs, one for each of the in-phase and
quadrature (I/Q) branches, so that further signal processing
can be performed in the digital domain. Despite this,
most of the literature on receiver design assumes ADCs with
virtually infinite precision for tractability of analysis. For
a large BS antenna array, however, such an assumption is no
longer justified since the cost and energy consumption of the
conventional architecture scale linearly with the number of
BS antennas, which will soon become the system bottleneck.
Therefore, we propose a mixed-ADC architecture in which
only 2K high-resolution ADCs are available and all the other
2(N - K) ADCs have only one-bit resolution^4. We further
let the I/Q outputs at each antenna be quantized by two ADCs
of the same kind. Thus the quantized output is


$$r_n^l = Q_n(y_n^l) = \begin{cases} h_n x^l + z_n^l, & \text{if } \delta_n = 1, \\ \mathrm{sgn}(h_n x^l + z_n^l), & \text{if } \delta_n = 0, \end{cases} \tag{2}$$

for $l = 1, \ldots, L$ and $n = 1, \ldots, N$. Here $\delta_n \in \{0, 1\}$ is an
indicator; $\delta_n = 1$ means that the ADCs corresponding to the
$n$-th antenna are high-resolution, whereas $\delta_n = 0$ indicates
that they are with one-bit resolution. Here for simplicity we
assume sufficiently high resolution for $\delta_n = 1$, so that the
residual quantization noise is negligible.


^4 Note that a one-bit ADC is particularly simple to implement in hardware,
say, using a polarity detector [12]. Furthermore, the analytical approach we
adopt in this work, based on the general framework in [18], can be extended
to other types of ADCs.


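To make the expression compact, we introduce $\bar{\delta}_n = 1 - \delta_n$
and rewrite (2) as

$$r_n^l = \delta_n \cdot (h_n x^l + z_n^l) + \bar{\delta}_n \cdot \mathrm{sgn}(h_n x^l + z_n^l). \tag{3}$$

Then, we define an ADC switch vector $\boldsymbol{\delta} = [\delta_1, \ldots, \delta_N]^T$,
which is subject to the constraint

$$\|\boldsymbol{\delta}\|_1 = \sum_{n=1}^{N} \delta_n = K, \tag{4}$$
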

and should be optimized according to the channel h so that the 
limited number of high-resolution ADCs will be well utilized 
to enhance the system performance. 
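
As a minimal sketch of (3) and (4), the following NumPy snippet quantizes one received vector for a given switch vector. The function names `complex_sign`, `strongest_k_switch` and `mixed_adc_quantize` are illustrative assumptions, not part of the paper, and assigning the high-resolution ADCs to the K strongest antennas is only one plausible switch rule used here for illustration.

```python
import numpy as np

def complex_sign(y):
    """One-bit quantization of the I and Q branches: sgn(y^R) + i*sgn(y^I)."""
    return np.sign(y.real) + 1j * np.sign(y.imag)

def strongest_k_switch(h, K):
    """Hypothetical switch rule: delta_n = 1 for the K largest |h_n|."""
    delta = np.zeros(len(h))
    delta[np.argsort(np.abs(h))[::-1][:K]] = 1.0
    return delta

def mixed_adc_quantize(y, delta):
    """Eq. (3): keep y_n where delta_n = 1, take the complex sign elsewhere."""
    return delta * y + (1.0 - delta) * complex_sign(y)

rng = np.random.default_rng(0)
N, K, Es = 8, 2, 1.0               # toy values, unit noise variance
h = (rng.standard_normal(N) + 1j * rng.standard_normal(N)) / np.sqrt(2)
x = np.sqrt(Es / 2) * (rng.standard_normal() + 1j * rng.standard_normal())
z = (rng.standard_normal(N) + 1j * rng.standard_normal(N)) / np.sqrt(2)
y = h * x + z                      # Eq. (1) for a single symbol time
delta = strongest_k_switch(h, K)   # satisfies Eq. (4): sum(delta) == K
r = mixed_adc_quantize(y, delta)
```

The one-bit entries of `r` lie on the four points ±1 ± i, while the high-resolution entries pass through unchanged.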

For transmission of rate $R$, the user selects a message $m$
from $\mathcal{M} = \{1, 2, \ldots, \lfloor 2^{LR} \rfloor\}$ uniformly randomly, and maps
the selected message to a transmitted codeword, i.e., a length-$L$
complex sequence, $\{x^l(m)\}_{l=1}^{L}$. In this paper, we restrict
the codebook to be drawn from a Gaussian ensemble; that is,
each codeword is a sequence of $L$ i.i.d. $\mathcal{CN}(0, \mathcal{E}_s)$ random
variables, and all the codewords are mutually independent.
Such a choice of codebook satisfies the average power constraint
$\frac{1}{L} \sum_{l=1}^{L} |x^l(m)|^2 \le \mathcal{E}_s$. We define the SNR as
$\mathrm{SNR} = \mathcal{E}_s / \sigma^2$, and let $\sigma^2 = 1$ hereafter for convenience.

As is well known, without receiver distortion, the Gaussian
codebook ensemble together with nearest-neighbor decoding
achieves the capacity of the conventional architecture^5,
$C(N, N, 0) = \log(1 + \|\mathbf{h}\|^2 \mathrm{SNR})$, as the codeword length $L$
grows without bound. With $(N - K)$ pairs of one-bit ADCs,
the channel capacity $C(N, K, N - K)$ is less than $C(N, N, 0)$
due to information loss during quantization.

As discussed in the introduction, instead of numerically
evaluating $C(N, K, N - K)$, in the following, we adopt the
nearest-neighbor decoding rule at the decoder, and leverage the
general framework developed in [18] to investigate the GMI of
the mixed-ADC architecture. The GMI acts as an achievable
rate and thus also a lower bound of $C(N, K, N - K)$. To this
end, we introduce a linear combiner^6 to process the channel
output vector, as illustrated in Figure 1. Thus the processed
channel output is

$$\hat{x}^l = \mathbf{w}^H \mathbf{r}^l, \tag{5}$$

for $l = 1, \ldots, L$, where $\mathbf{w}$ is designed according to the channel
$\mathbf{h}$ and the ADC switch vector $\boldsymbol{\delta}$.

^5 For an $N$-antenna SIMO channel, we let $C(N, K_1, K_2)$ denote its
capacity when equipped with $K_1$ pairs of high-resolution ADCs and $K_2$
pairs of one-bit ADCs, where $0 \le K_1, K_2$ and $K_1 + K_2 \le N$. Particularly,
for the mixed-ADC architecture, we have $K_1 = K$ and $K_2 = N - K$; for
antenna selection, we have $K_1 = K$ and $K_2 = 0$, discarding the outputs of
$(N - K)$ antennas.

With nearest-neighbor decoding, upon observing $\{\hat{x}^l\}_{l=1}^{L}$,
the decoder computes, for all messages, the Euclidean distances

$$D(m) = \frac{1}{L} \sum_{l=1}^{L} |\hat{x}^l - a x^l(m)|^2, \quad m \in \mathcal{M}, \tag{6}$$

and decides the received message as the one that minimizes
(6). Here the scaling parameter $a$ is adopted to adjust the power
imbalance between the channel input $x^l$ and the processed
output $\hat{x}^l$ contributed collectively by the channel, one-bit
quantization and the linear combiner, and should be selected
appropriately for optimizing the decoding performance.
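
The decoding rule (6) can be sketched as follows. This is a toy illustration under stated assumptions: the helper `nn_decode`, the codebook size, the value of the scaling parameter and the additive perturbation standing in for the combiner output are all hypothetical choices, not values from the paper.

```python
import numpy as np

def nn_decode(x_hat, codebook, a):
    """Return the index m minimizing D(m) in (6), plus all distances.

    x_hat:    processed channel outputs, shape (L,)
    codebook: candidate codewords x^l(m), shape (M, L)
    a:        complex scaling parameter of the decoding metric
    """
    D = np.mean(np.abs(x_hat[None, :] - a * codebook) ** 2, axis=1)
    return int(np.argmin(D)), D

rng = np.random.default_rng(1)
M, L, Es = 16, 64, 1.0
# Gaussian codebook ensemble: i.i.d. CN(0, Es) entries
codebook = np.sqrt(Es / 2) * (rng.standard_normal((M, L))
                              + 1j * rng.standard_normal((M, L)))
m_sent, a = 5, 0.8 - 0.3j              # toy values
noise = 0.1 * (rng.standard_normal(L) + 1j * rng.standard_normal(L))
x_hat = a * codebook[m_sent] + noise   # stand-in for the combiner output
m_hat, D = nn_decode(x_hat, codebook, a)
```

In the actual system the processed output is a nonlinear function of the transmitted symbol, so the best scaling parameter is not obvious in advance.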


III. GMI and Optimal Combining
A. GMI of the Proposed System Framework 

From now on, we suppress the time index $l$ for notational
simplicity. To facilitate the exposition, we summarize (3) and
(5) as

$$\hat{x} = \mathbf{w}^H \mathbf{r} = f(x, \mathbf{h}, \mathbf{z}), \tag{7}$$

where $f(\cdot)$ is a memoryless nonlinear distortion function that
incorporates the effects of output quantization as well as linear
combining, and maps the triple $(x, \mathbf{h}, \mathbf{z})$ into the processed
output $\hat{x}$. Although $\boldsymbol{\delta}$ and $\mathbf{w}$ are made invisible in the function
$f(\cdot)$ since they are both determined by $\mathbf{h}$, we need to keep in
mind that $f(\cdot)$ implicitly includes $\boldsymbol{\delta}$ and $\mathbf{w}$.

We apply the general framework developed in [18] to
derive the GMI of the system architecture. The GMI is a
lower bound of the channel capacity, and more precisely, it
characterizes the maximum achievable rate under the specified
random codebook (Gaussian ensemble here) and the specified
decoding rule (nearest-neighbor decoding here) such that the
average decoding error probability (averaged over the codebook
ensemble) is guaranteed to vanish asymptotically as
the codeword length grows without bound [17]. Particularly,
conditioned on $\mathbf{w}$ and $\boldsymbol{\delta}$, the GMI takes the following form
analogous to [18, Eq. (89)]; that is,

$$I_{\mathrm{GMI}}(\mathbf{w}, \boldsymbol{\delta}) = \sup_{a \in \mathbb{C},\, \theta < 0} \left( \theta \mathbb{E}\left[ |f(x, \mathbf{h}, \mathbf{z}) - a x|^2 \right] - \frac{\theta \mathbb{E}\left[ |f(x, \mathbf{h}, \mathbf{z})|^2 \right]}{1 - \theta |a|^2 \mathcal{E}_s} + \log\left(1 - \theta |a|^2 \mathcal{E}_s\right) \right), \tag{8}$$

where the expectation is taken with respect to $x$ and $\mathbf{z}$. The
parameter $a$ is in the nearest-neighbor decoding rule (6),
and the parameter $\theta$ is from the underlying large-deviations
argument; for further details about the derivation of the
expression, we refer to [17], [18]. Then we can solve the
optimization problem in (8), following essentially the same
line as [18, App. C], and obtain an explicit expression of the
GMI as follows.

^6 There should be some nonlinear receiver that outperforms the linear one
in this paper, which will be studied in a future work.


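Proposition 1. With Gaussian codebook ensemble and
nearest-neighbor decoding, the GMI for given $\mathbf{w}$ and $\boldsymbol{\delta}$ is

$$I_{\mathrm{GMI}}(\mathbf{w}, \boldsymbol{\delta}) = \log\left(1 + \frac{\kappa(\mathbf{w}, \boldsymbol{\delta})}{1 - \kappa(\mathbf{w}, \boldsymbol{\delta})}\right), \tag{9}$$

where the parameter $\kappa(\mathbf{w}, \boldsymbol{\delta})$ is

$$\kappa(\mathbf{w}, \boldsymbol{\delta}) = \frac{\left| \mathbb{E}\left[ f^*(x, \mathbf{h}, \mathbf{z}) \cdot x \right] \right|^2}{\mathcal{E}_s \mathbb{E}\left[ |f(x, \mathbf{h}, \mathbf{z})|^2 \right]}. \tag{10}$$

The corresponding optimal choice of the scaling parameter $a$
is

$$a_{\mathrm{opt}}(\mathbf{w}, \boldsymbol{\delta}) = \frac{\mathbb{E}\left[ f(x, \mathbf{h}, \mathbf{z}) \cdot x^* \right]}{\mathcal{E}_s}. \tag{11}$$

We note that the expectation is taken with respect to $x$ and $\mathbf{z}$.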
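
As a sanity check on how (9) follows from (8), the supremum can be approximated by a brute-force grid search and compared with the closed form. The snippet below is only a sketch: the second-order moments `F` and `c` are assumed toy values rather than quantities from the paper, and the scaling parameter is restricted to real nonnegative values, which loses nothing when `c` is real and positive.

```python
import numpy as np

Es = 1.0
F = 2.0            # assumed value of E[|f|^2]
c = 1.0            # assumed value of E[f x^*], taken real and positive

def objective(a, theta):
    """Bracketed term of (8) for real a >= 0 aligned with c."""
    err = F - 2.0 * a * c + a ** 2 * Es        # E[|f - a x|^2]
    g = 1.0 - theta * a ** 2 * Es
    return theta * err - theta * F / g + np.log(g)

a_grid = np.linspace(0.0, 2.0, 401)
theta_grid = -np.linspace(0.01, 3.0, 300)      # theta < 0
A, TH = np.meshgrid(a_grid, theta_grid, indexing="ij")
J = objective(A, TH)
i, j = np.unravel_index(np.argmax(J), J.shape)

kappa = abs(c) ** 2 / (Es * F)                       # Eq. (10)
gmi_closed = np.log(1.0 + kappa / (1.0 - kappa))     # Eq. (9)
gmi_grid = J[i, j]                                   # grid approximation of (8)
a_best, theta_best = a_grid[i], theta_grid[j]
```

For these toy moments the grid maximizer should sit near `a = c / Es`, in line with (11).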

It is worth noting that $\kappa(\mathbf{w}, \boldsymbol{\delta})$ is the squared correlation
coefficient of channel input $x$ and the processed output
$f(x, \mathbf{h}, \mathbf{z})$, and thus is upper bounded by one, from the
Cauchy-Schwarz inequality. Moreover, $I_{\mathrm{GMI}}(\mathbf{w}, \boldsymbol{\delta})$ is a strictly
increasing function of $\kappa(\mathbf{w}, \boldsymbol{\delta})$ for $\kappa(\mathbf{w}, \boldsymbol{\delta}) \in (0, 1)$. Therefore,
in the following, we will seek to maximize $\kappa(\mathbf{w}, \boldsymbol{\delta})$ by
choosing a well-designed linear combiner $\mathbf{w}$ and ADC switch
vector $\boldsymbol{\delta}$. To this end, we first derive a closed-form expression
for $\kappa(\mathbf{w}, \boldsymbol{\delta})$. The result is summarized by the following
proposition.


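Proposition 2. Given $\mathbf{w}$ and $\boldsymbol{\delta}$, for (10) in Proposition 1, we
have

$$\kappa(\mathbf{w}, \boldsymbol{\delta}) = \frac{\mathbf{w}^H \mathbf{R}_{rx} \mathbf{R}_{rx}^H \mathbf{w}}{\mathcal{E}_s \mathbf{w}^H \mathbf{R}_{rr} \mathbf{w}}, \tag{12}$$

where $\mathbf{R}_{rx}$ is the correlation vector between $\mathbf{r}$ and $x$, with its
$n$-th element being

$$(\mathbf{R}_{rx})_n = h_n \mathcal{E}_s \left[ \delta_n + \bar{\delta}_n \cdot \sqrt{\frac{4}{\pi (|h_n|^2 \mathcal{E}_s + 1)}} \right], \tag{13}$$

and $\mathbf{R}_{rr}$ is the covariance matrix of $\mathbf{r}$, with its $(n, m)$-th entry
being

$$(\mathbf{R}_{rr})_{n,m} = \begin{cases} 1 + \delta_n \cdot |h_n|^2 \mathcal{E}_s + \bar{\delta}_n, & \text{if } n = m, \\ h_n h_m^* \mathcal{E}_s \left[ \delta_n \delta_m + \delta_n \bar{\delta}_m \sqrt{\frac{4}{\pi (|h_m|^2 \mathcal{E}_s + 1)}} + \bar{\delta}_n \delta_m \sqrt{\frac{4}{\pi (|h_n|^2 \mathcal{E}_s + 1)}} \right] + \bar{\delta}_n \bar{\delta}_m \cdot \frac{4}{\pi} \left[ \arcsin\left( \frac{(h_n h_m^* \mathcal{E}_s)^R}{\sqrt{|h_n|^2 \mathcal{E}_s + 1} \sqrt{|h_m|^2 \mathcal{E}_s + 1}} \right) + i \cdot \arcsin\left( \frac{(h_n h_m^* \mathcal{E}_s)^I}{\sqrt{|h_n|^2 \mathcal{E}_s + 1} \sqrt{|h_m|^2 \mathcal{E}_s + 1}} \right) \right], & \text{if } n \neq m. \end{cases} \tag{14}$$

The corresponding optimal choice of the scaling parameter $a$
in (11) is

$$a_{\mathrm{opt}}(\mathbf{w}, \boldsymbol{\delta}) = \frac{1}{\mathcal{E}_s} \mathbf{w}^H \mathbf{R}_{rx}. \tag{15}$$

Proof: See Appendix-A. ■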
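
The closed-form statistics of Proposition 2 can be cross-checked against sample averages. The sketch below assumes unit noise variance and uses a hypothetical helper `mixed_adc_stats`; the channel vector, switch vector and sample size are arbitrary toy choices.

```python
import numpy as np

def mixed_adc_stats(h, delta, Es):
    """Closed-form R_rx (13) and R_rr (14), assuming unit noise variance."""
    h = np.asarray(h, dtype=complex)
    d = np.asarray(delta, dtype=float)
    db = 1.0 - d
    p = np.abs(h) ** 2 * Es + 1.0                 # power at each quantizer input
    g = np.sqrt(4.0 / (np.pi * p))                # one-bit correlation factor
    R_rx = h * Es * (d + db * g)
    C = np.outer(h, h.conj()) * Es                # h_n h_m^* E_s
    rho = C / np.sqrt(np.outer(p, p))
    lin = np.outer(d, d) + np.outer(d, db * g) + np.outer(db * g, d)
    R_rr = C * lin + np.outer(db, db) * (4.0 / np.pi) * (
        np.arcsin(rho.real) + 1j * np.arcsin(rho.imag))
    np.fill_diagonal(R_rr, 1.0 + d * np.abs(h) ** 2 * Es + db)
    return R_rx, R_rr

rng = np.random.default_rng(2)
N, Es, n_samp = 4, 1.0, 200_000
h = np.array([1.0 + 0.5j, -0.3 + 0.8j, 0.6 - 0.2j, 0.1 + 0.9j])
delta = np.array([1.0, 0.0, 0.0, 1.0])
R_rx, R_rr = mixed_adc_stats(h, delta, Es)

# Monte Carlo estimates of E[r x^*] and E[r r^H]
x = np.sqrt(Es / 2) * (rng.standard_normal(n_samp)
                       + 1j * rng.standard_normal(n_samp))
z = (rng.standard_normal((N, n_samp))
     + 1j * rng.standard_normal((N, n_samp))) / np.sqrt(2)
y = h[:, None] * x + z
r = delta[:, None] * y + (1 - delta[:, None]) * (
    np.sign(y.real) + 1j * np.sign(y.imag))
R_rx_mc = (r * x.conj()).mean(axis=1)
R_rr_mc = r @ r.conj().T / n_samp
```

The sample estimates `R_rx_mc` and `R_rr_mc` should agree with the closed forms up to Monte Carlo error.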

B. Optimization of Linear Combiner 

In the previous subsection, the GMI of the system architecture
is derived, as a function of $\mathbf{h}$, $\mathbf{w}$ and $\boldsymbol{\delta}$. In this
subsection, we turn to the optimization of $\mathbf{w}$ such that the GMI
is maximized for given $\mathbf{h}$ and $\boldsymbol{\delta}$. The subsequent proposition
summarizes our result.

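Proposition 3. For given $\mathbf{h}$ and $\boldsymbol{\delta}$, the optimal linear combiner
$\mathbf{w}$ takes the following form

$$\mathbf{w}_{\mathrm{opt}} = \mathbf{R}_{rr}^{-1} \mathbf{R}_{rx}, \tag{16}$$

which is in fact a linear MMSE combiner that minimizes the
mean squared estimation error of $x$ upon observing $\mathbf{r}$ among
all linear combiners. The corresponding $\kappa(\mathbf{w}, \boldsymbol{\delta})$ is

$$\kappa(\mathbf{w}_{\mathrm{opt}}, \boldsymbol{\delta}) = a_{\mathrm{opt}}(\mathbf{w}_{\mathrm{opt}}, \boldsymbol{\delta}) = \frac{1}{\mathcal{E}_s} \mathbf{R}_{rx}^H \mathbf{R}_{rr}^{-1} \mathbf{R}_{rx}. \tag{17}$$

Proof: Noticing that $\mathbf{R}_{rr}$ is a positive semidefinite Hermitian
matrix, from (12) we have

$$\begin{aligned} \kappa(\mathbf{w}, \boldsymbol{\delta}) &= \frac{1}{\mathcal{E}_s} \frac{|\mathbf{w}^H \mathbf{R}_{rx}|^2}{\mathbf{w}^H \mathbf{R}_{rr} \mathbf{w}} \\ &= \frac{1}{\mathcal{E}_s} \frac{|\mathbf{w}^H \mathbf{R}_{rr}^{1/2} \mathbf{R}_{rr}^{-1/2} \mathbf{R}_{rx}|^2}{\|\mathbf{w}^H \mathbf{R}_{rr}^{1/2}\|^2} \\ &\le \frac{1}{\mathcal{E}_s} \frac{\|\mathbf{w}^H \mathbf{R}_{rr}^{1/2}\|^2 \cdot \|\mathbf{R}_{rr}^{-1/2} \mathbf{R}_{rx}\|^2}{\|\mathbf{w}^H \mathbf{R}_{rr}^{1/2}\|^2} \\ &= \frac{1}{\mathcal{E}_s} \|\mathbf{R}_{rr}^{-1/2} \mathbf{R}_{rx}\|^2, \end{aligned} \tag{18}$$

where the inequality follows from the Cauchy-Schwarz inequality,
which holds with equality if and only if $\mathbf{w}^H \mathbf{R}_{rr}^{1/2} = (\mathbf{R}_{rr}^{-1/2} \mathbf{R}_{rx})^H$,
i.e., $\mathbf{w}_{\mathrm{opt}} = \mathbf{R}_{rr}^{-1} \mathbf{R}_{rx}$. ■

The subsequent corollary demonstrates that the mixed-ADC
architecture achieves better performance than antenna
selection with the same number of high-resolution ADCs.

Corollary 1. Suppose that the high-resolution ADCs are
switched to the antennas with the strongest $K$ link magnitude
gains, and denote the corresponding ADC switch vector as $\boldsymbol{\delta}^{\star}$.
Then, the following relationship

$$I_{\mathrm{GMI}}(\mathbf{w}_{\mathrm{opt}}, \boldsymbol{\delta}^{\star}) \ge C(N, K, 0) \tag{19}$$

holds, where $C(N, K, 0) = \log\left(1 + \sum_{n=1}^{N} \delta_n^{\star} \cdot |h_n|^2 \mathcal{E}_s\right)$ is the
capacity of the antenna selection solution.

Proof: Provided that the high-resolution ADCs are
switched according to $\boldsymbol{\delta}^{\star}$, by specifying $w_n = \delta_n^{\star} \cdot h_n$,
$n = 1, \ldots, N$, it is straightforward to verify that $I_{\mathrm{GMI}}(\mathbf{w}, \boldsymbol{\delta}^{\star}) =
C(N, K, 0)$. Since this choice of $\mathbf{w}$ is not optimal, we have
$I_{\mathrm{GMI}}(\mathbf{w}_{\mathrm{opt}}, \boldsymbol{\delta}^{\star}) \ge I_{\mathrm{GMI}}(\mathbf{w}, \boldsymbol{\delta}^{\star})$ and (19) follows. ■

When $K = N$, i.e., all the $N$ pairs of ADCs are high-resolution,
we have the following corollary of Proposition 3.

Corollary 2. For the special case of $K = N$, the optimal
linear combiner (16) reduces to a maximum ratio combiner
(MRC). Thus in this case, the GMI coincides with the channel
capacity of the conventional architecture $C(N, N, 0)$.

Proof: For the special case of $K = N$, i.e., $\boldsymbol{\delta} = \mathbf{1}$, (13)
reduces to $\mathbf{R}_{rx} = \mathcal{E}_s \mathbf{h}$, and (14) reduces to $\mathbf{R}_{rr} = \mathbf{I} + \mathcal{E}_s \mathbf{h} \mathbf{h}^H$.
Then, the optimal combiner (16) turns out to be an MRC, since

$$\mathbf{w}_{\mathrm{opt}} = \mathbf{R}_{rr}^{-1} \mathbf{R}_{rx} = \frac{\mathcal{E}_s}{1 + \|\mathbf{h}\|^2 \mathcal{E}_s} \mathbf{h}. \tag{20}$$

Consequently, it is straightforward to verify that the effective
SNR in (9) is

$$\frac{\kappa(\mathbf{w}_{\mathrm{opt}}, \boldsymbol{\delta})}{1 - \kappa(\mathbf{w}_{\mathrm{opt}}, \boldsymbol{\delta})} = \|\mathbf{h}\|^2 \mathcal{E}_s, \tag{21}$$

thus completing the proof. ■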
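
The results of this subsection lend themselves to a quick numerical check. The sketch below, again assuming unit noise variance and hypothetical helper names (`mixed_adc_stats`, `gmi_opt`), evaluates the GMI under the optimal combiner and compares it with antenna selection (Corollary 1) and with the all-high-resolution case (Corollary 2). The channel vector and SNR are arbitrary toy values.

```python
import numpy as np

def mixed_adc_stats(h, delta, Es):
    """Closed-form R_rx (13) and R_rr (14), assuming unit noise variance."""
    h = np.asarray(h, dtype=complex)
    d = np.asarray(delta, dtype=float)
    db = 1.0 - d
    p = np.abs(h) ** 2 * Es + 1.0
    g = np.sqrt(4.0 / (np.pi * p))
    R_rx = h * Es * (d + db * g)
    C = np.outer(h, h.conj()) * Es
    rho = C / np.sqrt(np.outer(p, p))
    lin = np.outer(d, d) + np.outer(d, db * g) + np.outer(db * g, d)
    R_rr = C * lin + np.outer(db, db) * (4.0 / np.pi) * (
        np.arcsin(rho.real) + 1j * np.arcsin(rho.imag))
    np.fill_diagonal(R_rr, 1.0 + d * np.abs(h) ** 2 * Es + db)
    return R_rx, R_rr

def gmi_opt(h, delta, Es):
    """GMI (9) under the optimal combiner (16), via kappa in (17)."""
    R_rx, R_rr = mixed_adc_stats(h, delta, Es)
    w_opt = np.linalg.solve(R_rr, R_rx)
    kappa = np.real(np.vdot(R_rx, w_opt)) / Es
    return -np.log1p(-kappa), w_opt       # log(1 + k/(1-k)) = -log(1-k)

h = np.array([1.0 + 0.5j, -0.3 + 0.8j, 0.6 - 0.2j, 0.1 + 0.9j])
Es, K = 2.0, 2
delta = np.zeros(len(h))
delta[np.argsort(np.abs(h))[-K:]] = 1.0   # strongest-K switching

gmi_mixed, _ = gmi_opt(h, delta, Es)
cap_sel = np.log1p(np.sum(delta * np.abs(h) ** 2) * Es)    # C(N, K, 0)
gmi_full, w_full = gmi_opt(h, np.ones(len(h)), Es)
cap_full = np.log1p(np.sum(np.abs(h) ** 2) * Es)           # C(N, N, 0)
```

With all ADCs high-resolution, `w_full` is a scaled copy of `h`, which is the MRC structure of Corollary 2.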


C. Asymptotic Behaviors of $I_{\mathrm{GMI}}(\mathbf{w}_{\mathrm{opt}}, \boldsymbol{\delta})$

In the previous subsection, the optimal linear combiner for 
the mixed-ADC architecture is derived. Thus we are ready 
to examine its asymptotic performance in both low and high 
SNR regimes. Letting SNR tend to zero, we have the following 
corollary. 

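Corollary 3. As $\mathcal{E}_s \to 0$, for given $\boldsymbol{\delta}$ we have

$$I_{\mathrm{GMI}}(\mathbf{w}_{\mathrm{opt}}, \boldsymbol{\delta}) = \sum_{n=1}^{N} \left( \delta_n + \bar{\delta}_n \cdot \frac{2}{\pi} \right) |h_n|^2 \mathcal{E}_s + o(\mathcal{E}_s). \tag{22}$$

See Appendix-B for its proof. Comparing with $C(N, N, 0)$
in the low SNR regime, i.e., $C(N, N, 0) = \sum_{n=1}^{N} |h_n|^2 \mathcal{E}_s + o(\mathcal{E}_s)$,
we conclude that part of the achievable rate is degraded
by a factor of $2/\pi$ due to one-bit quantization. The expression
(22) also suggests that, in the low SNR regime, high-resolution
ADCs should be switched to the antennas with the strongest
$K$ link magnitude gains.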
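
The low-SNR expansion can be compared numerically with the exact GMI under the optimal linear combiner. The snippet is a sketch under the same assumptions as before (unit noise variance, hypothetical helper `gmi_opt`, toy channel values); it uses the first-order term with the $2/\pi$ weight on one-bit antennas.

```python
import numpy as np

def gmi_opt(h, delta, Es):
    """GMI under the optimal combiner, from closed-form R_rx and R_rr."""
    h = np.asarray(h, dtype=complex)
    d = np.asarray(delta, dtype=float)
    db = 1.0 - d
    p = np.abs(h) ** 2 * Es + 1.0
    g = np.sqrt(4.0 / (np.pi * p))
    R_rx = h * Es * (d + db * g)
    C = np.outer(h, h.conj()) * Es
    rho = C / np.sqrt(np.outer(p, p))
    lin = np.outer(d, d) + np.outer(d, db * g) + np.outer(db * g, d)
    R_rr = C * lin + np.outer(db, db) * (4.0 / np.pi) * (
        np.arcsin(rho.real) + 1j * np.arcsin(rho.imag))
    np.fill_diagonal(R_rr, 1.0 + d * np.abs(h) ** 2 * Es + db)
    kappa = np.real(np.vdot(R_rx, np.linalg.solve(R_rr, R_rx))) / Es
    return -np.log1p(-kappa)

h = np.array([1.0 + 0.5j, -0.3 + 0.8j, 0.6 - 0.2j, 0.1 + 0.9j])
delta = np.array([1.0, 0.0, 0.0, 1.0])
Es = 1e-4                                  # deep in the low SNR regime

exact = gmi_opt(h, delta, Es)
# First-order term of the low-SNR expansion
approx = np.sum((delta + (1 - delta) * 2 / np.pi) * np.abs(h) ** 2) * Es
rel_err = abs(exact - approx) / approx
```

Repeating the comparison for larger `Es` shows where the first-order approximation stops being accurate.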

For the high SNR case, the subsequent corollary collects 
our results. 

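Corollary 4. As $\mathcal{E}_s \to \infty$, for given $\boldsymbol{\delta}$ we have the effective
SNR in (9) as

$$\frac{\kappa(\mathbf{w}_{\mathrm{opt}}, \boldsymbol{\delta})}{1 - \kappa(\mathbf{w}_{\mathrm{opt}}, \boldsymbol{\delta})} = \|\mathbf{p}\|^2 \mathcal{E}_s + \frac{[4 + O(1/\mathcal{E}_s)] \mathbf{q}^H \mathbf{B}^{-1} \mathbf{q}}{\pi - [4 + O(1/\mathcal{E}_s)] \mathbf{q}^H \mathbf{B}^{-1} \mathbf{q}}, \tag{23}$$

with $\mathbf{p}$, $\mathbf{q}$, and $\mathbf{B}$ given in (69) and (73). As a result,
$I_{\mathrm{GMI}}(\mathbf{w}_{\mathrm{opt}}, \boldsymbol{\delta})$ scales as

$$I_{\mathrm{GMI}}(\mathbf{w}_{\mathrm{opt}}, \boldsymbol{\delta}) = 2 \log \|\mathbf{p}\| + \log(\mathcal{E}_s) + O(1/\mathcal{E}_s). \tag{24}$$

Besides, for the special case of pure one-bit quantization, i.e.,
$K = 0$, we get

$$\lim_{\mathcal{E}_s \to \infty} I_{\mathrm{GMI}}(\mathbf{w}_{\mathrm{opt}}, \boldsymbol{\delta}) = \log\left(1 + \frac{4 \mathbf{q}^H \mathbf{B}^{-1} \mathbf{q}}{\pi - 4 \mathbf{q}^H \mathbf{B}^{-1} \mathbf{q}}\right), \tag{25}$$

where $\mathbf{B}$ is also given by (73) suppressing all the $O(1/\mathcal{E}_s)$
terms.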

The proof is given in Appendix-C. From (23) we notice
that the contributions of high-resolution ADCs and one-bit
ADCs in the high SNR regime are separate, as the first term
corresponding to high-resolution ADCs increases linearly with
$\mathcal{E}_s$, whereas the second term coming from one-bit ADCs
tends to a positive constant independent of $\mathcal{E}_s$. Comparing
with Corollary 3, we infer that one-bit ADCs become less
beneficial as the SNR grows large, as will be validated by
numerical study in Section VII. In addition, (24)
suggests that, at high SNR, high-resolution ADCs should
also be switched to the antennas with the strongest $K$ link
magnitude gains.

For the special case of pure one-bit quantization, (25)
indicates that the corresponding GMI approaches a finite limit,
and thus the rate loss due to one-bit quantization is substantial.
This is much different from the conclusion we get in the
low SNR regime, where one-bit quantization degrades the
achievable rate only by a factor of $2/\pi$. The reason underlying
this phenomenon is that the amplitude of the transmit signal
cannot be recovered at the receiver when $\mathcal{E}_s$ is sufficiently
large, and thus further enhancing the SNR does not help
in improving $I_{\mathrm{GMI}}(\mathbf{w}_{\mathrm{opt}}, \boldsymbol{\delta})$ (see also [21], [22] for similar
phenomena).

D. Performance Improvement via Dithering 

In the previous part of this section, we derived the optimal
linear combiner and explored the asymptotic behaviors of
$I_{\mathrm{GMI}}(\mathbf{w}_{\mathrm{opt}}, \boldsymbol{\delta})$ in both low and high SNR regimes. As will
be revealed by the corresponding numerical study in Section
VII, increasing SNR may indeed degrade the GMI when the
SNR exceeds a certain threshold that depends on a collection
of system parameters. In this situation, Gaussian noise, as a
special type of dither, can expand the effective bit-width of
one-bit ADCs and thus helps reduce the estimation bias [7],
[23]. Uniform dithering is known to be asymptotically optimal
under certain problem setups [23], but its non-asymptotic
behavior is not amenable to analysis. Therefore, we adopt
Gaussian dithering and investigate its impact on the system
performance.

We consider a dithering strategy, which injects additional
Gaussian noise into the antenna output before quantization
when the corresponding pair of ADCs are one-bit and the
receive SNR of the antenna, $|h_n|^2 \mathcal{E}_s$, exceeds a prescribed
threshold $T$. The power of the injected Gaussian noise is
adjusted so that the resulting receive SNR of this antenna after
dithering is pulled back to $T$. Accordingly, we rewrite (2) as

$$r_n = \begin{cases} h_n x + z_n, & \text{if } \delta_n = 1, \\ \mathrm{sgn}(h_n x + z_n), & \text{if } \delta_n = 0,\ |h_n|^2 \mathcal{E}_s \le T, \\ \mathrm{sgn}(h_n x + z_n + z_n^d), & \text{if } \delta_n = 0,\ |h_n|^2 \mathcal{E}_s > T, \end{cases} \tag{26}$$

where the Gaussian dither $z_n^d \sim \mathcal{CN}(0, |h_n|^2 \mathcal{E}_s / T - 1)$ is
independent of $z_n$ so that $z_n + z_n^d \sim \mathcal{CN}(0, |h_n|^2 \mathcal{E}_s / T)$. Since
high SNR is always favorable for high-resolution ADCs, we do
not perform dithering for antennas with high-resolution ADCs.

The system architecture and optimal linear combiner developed
in Section III still apply directly, except that we need to
make some modifications to $\mathbf{R}_{rx}$ in (13) and $\mathbf{R}_{rr}$ in (14):
for any $n \in \{1, 2, \ldots, N\}$, whenever $\delta_n = 0$ and $|h_n|^2 \mathcal{E}_s > T$,
we make the following substitution,

$$|h_n|^2 \mathcal{E}_s + 1 \to |h_n|^2 \mathcal{E}_s (1 + 1/T), \tag{27}$$

in (13) and (14). The optimal threshold $T_{\mathrm{opt}}$ depends on $K$,
$N$, and SNR. For the situation with relatively small $K$, the
dependence of $T_{\mathrm{opt}}$ on $K$ is actually negligible. Nevertheless,
the analytical optimization of $T$ is still difficult, and thus we
perform a numerical search. To be specific, for any given SNR
and $N$, we find the optimal threshold $T_{\mathrm{opt}}$ for $K = 0$ through
a Monte Carlo simulation, and then use $T_{\mathrm{opt}}$ to evaluate the
performance gain with $K \ge 1$ as well. Numerical results will
be presented in Section VII.
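
A minimal sketch of the substitution (27) and of a numerical threshold search is given below. It handles a single fixed channel rather than the Monte Carlo average over channels described above, and the helper `gmi_with_dither`, the channel vector, the SNR and the threshold grid are all illustrative assumptions. Passing `T = inf` disables dithering, so the search can never do worse than the undithered baseline.

```python
import numpy as np

def gmi_with_dither(h, delta, Es, T=np.inf):
    """GMI (9) under w_opt, with substitution (27) for dithered antennas."""
    h = np.asarray(h, dtype=complex)
    d = np.asarray(delta, dtype=float)
    db = 1.0 - d
    snr = np.abs(h) ** 2 * Es
    dithered = (db == 1.0) & (snr > T)             # one-bit and above threshold
    p = np.where(dithered, snr * (1.0 + 1.0 / T), snr + 1.0)   # Eq. (27)
    g = np.sqrt(4.0 / (np.pi * p))
    R_rx = h * Es * (d + db * g)
    C = np.outer(h, h.conj()) * Es
    rho = C / np.sqrt(np.outer(p, p))
    lin = np.outer(d, d) + np.outer(d, db * g) + np.outer(db * g, d)
    R_rr = C * lin + np.outer(db, db) * (4.0 / np.pi) * (
        np.arcsin(rho.real) + 1j * np.arcsin(rho.imag))
    np.fill_diagonal(R_rr, 1.0 + d * snr + db)
    kappa = np.real(np.vdot(R_rx, np.linalg.solve(R_rr, R_rx))) / Es
    return -np.log1p(-kappa)

h = np.array([1.0 + 0.5j, -0.3 + 0.8j, 0.6 - 0.2j, 0.1 + 0.9j])
delta = np.zeros(len(h))                  # K = 0: pure one-bit quantization
Es = 100.0                                # 20 dB, toy value

gmi_plain = gmi_with_dither(h, delta, Es)             # no dithering
T_grid = np.logspace(-1, 2, 31)                        # candidate thresholds
gmis = np.array([gmi_with_dither(h, delta, Es, T) for T in T_grid])
T_best = T_grid[np.argmax(gmis)]
gmi_best = max(gmis.max(), gmi_plain)
```

Whether and by how much `gmi_best` exceeds `gmi_plain` depends on the channel and SNR; averaging over channel realizations would be needed to reproduce the threshold search described in the text.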

IV. Ergodic Fading Channels 

Although our analysis thus far has been for the fixed channel
scenario, the analytical framework developed can be extended
to the randomly varying channel scenario. We assume
that the channel fading process $\{\mathbf{h}^l\}$ obeys the block fading
channel model among coherence intervals. We start with the
perfect CSI situation and then investigate the impact of channel
estimation error on performance.


A. Perfect CSI 

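Since the channel vector $\mathbf{h}$ varies over time now, $\mathbf{w}$ and $\boldsymbol{\delta}$
in this situation shall be designed based on the instantaneous
channel realization. In this situation, the GMI becomes^7

$$I_{\mathrm{GMI}} = \sup_{a \in \mathbb{C},\, \theta < 0} \left( \theta \mathbb{E}_{x, \mathbf{z}, \mathbf{h}}\left[ |f(x, \mathbf{h}, \mathbf{z}) - a x|^2 \right] - \frac{\theta \mathbb{E}_{x, \mathbf{z}, \mathbf{h}}\left[ |f(x, \mathbf{h}, \mathbf{z})|^2 \right]}{1 - \theta |a|^2 \mathcal{E}_s} + \log\left(1 - \theta |a|^2 \mathcal{E}_s\right) \right). \tag{28}$$

Notice that it shares the same nominal form as (8) except
that the expectation here is over $x$, $\mathbf{z}$, and $\mathbf{h}$. Recognizing the
difficulty of this optimization problem, we turn to evaluating
lower and upper bounds of $I_{\mathrm{GMI}}$ and arrive at the following
proposition. Numerical results will be given in Section VII to
verify the tightness of the lower and upper bounds.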


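Proposition 4. For the ergodic fading channel scenario, lower
and upper bounds of $I_{\mathrm{GMI}}$ are given by

$$I_{\mathrm{GMI}}^{\mathrm{lower}} = \log\left(1 + \frac{\mathbb{E}_{\mathbf{h}}[\kappa(\mathbf{w}_{\mathrm{opt}}, \boldsymbol{\delta})]}{1 - \mathbb{E}_{\mathbf{h}}[\kappa(\mathbf{w}_{\mathrm{opt}}, \boldsymbol{\delta})]}\right), \tag{29}$$

$$I_{\mathrm{GMI}}^{\mathrm{upper}} = \mathbb{E}_{\mathbf{h}}\left[ \log\left(1 + \frac{\kappa(\mathbf{w}_{\mathrm{opt}}, \boldsymbol{\delta})}{1 - \kappa(\mathbf{w}_{\mathrm{opt}}, \boldsymbol{\delta})}\right) \right], \tag{30}$$

respectively, where $\kappa(\mathbf{w}_{\mathrm{opt}}, \boldsymbol{\delta})$ is given by (17).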

Proof: Following a similar procedure as [18, App. C], we 
obtain k in this situation as 

_ |E^,^,h[/*(a^,h,z) • x]p 

£sEa;,z,h[|/(a;, h, z)P] 

which shares exactly the same form as (10), except that the 
expectation is taken over x, z, and h. The maximization of 
K shall be accomplished by optimizing the linear combiner. 
Therefore by specifying w to be designed according to (16), 
we get a lower bound of the optimal k, since this design is just 


^Here for simplicity we consider a fixed value of a in the nearest neighbor 
decoding metric. Allowing a to vary based on h* may result in some 
performance improvement especially when N is not too large. 
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one of the feasible options and thus is not necessarily optimal; 
that is 

_ \Eh[Ea;^^[f*{x,h,z) ■ a:|h]]p 

^ £sEh[]E^,z[|/(a;,h,z)|2|h]] 

^ |Eh[w-^Rrx]P 

“ £sEh[w^RrrW] 

= Eh[K(wopt,^)], (32) 

where the last equation comes from (16)-(17). Consequently, 
we obtain the lower bound of /gmi as given by (29). 

To prove (30), we first rewrite (28) as 


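$$I_{\mathrm{GMI}} = \sup_{a\in\mathbb{C},\,\theta<0}\mathbb{E}_{\mathbf{h}}\left[\theta\,\mathbb{E}_{x,z}\big[|f(x,\mathbf{h},z)-ax|^2\,\big|\,\mathbf{h}\big]-\frac{\theta\,\mathbb{E}_{x,z}\big[|f(x,\mathbf{h},z)|^2\,\big|\,\mathbf{h}\big]}{1-\theta|a|^2\mathcal{E}_s}+\log\big(1-\theta|a|^2\mathcal{E}_s\big)\right]. \tag{33}$$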


Then, to derive the upper bound we simply exchange the order 
of supremum operation and the expectation over h. This leads 
to 


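$$I_{\mathrm{GMI}} \le \mathbb{E}_{\mathbf{h}}\left[\sup_{a\in\mathbb{C},\,\theta<0}\left(\theta\,\mathbb{E}_{x,z}\big[|f(x,\mathbf{h},z)-ax|^2\,\big|\,\mathbf{h}\big]-\frac{\theta\,\mathbb{E}_{x,z}\big[|f(x,\mathbf{h},z)|^2\,\big|\,\mathbf{h}\big]}{1-\theta|a|^2\mathcal{E}_s}+\log\big(1-\theta|a|^2\mathcal{E}_s\big)\right)\right]. \tag{34}$$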


Consequently, (30) follows directly from (8) and the subsequent
results established for the fixed SIMO channels. ■
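
The ordering of the two bounds in Proposition 4 can be checked numerically. The sketch below assumes that per-realization values of $\kappa(\mathbf{w}_{\mathrm{opt}},\boldsymbol{\delta})$ in [0, 1) are already available; the sample values and the helper name `gmi_bounds` are illustrative and not taken from the paper, and base-2 logarithms are used so that rates come out in bits. Since $\log(1+\kappa/(1-\kappa)) = -\log(1-\kappa)$ is convex in $\kappa$, Jensen's inequality guarantees that (29) never exceeds (30).

```python
import math

def gmi_bounds(kappas):
    """Return (lower, upper) GMI bounds in bits from samples of kappa(w_opt, delta).

    lower follows the form of (29): log(1 + E[k] / (1 - E[k]))
    upper follows the form of (30): E[log(1 + k / (1 - k))]
    """
    if not kappas or any(not (0.0 <= k < 1.0) for k in kappas):
        raise ValueError("kappa samples must lie in [0, 1)")
    mean_kappa = sum(kappas) / len(kappas)
    lower = math.log2(1.0 + mean_kappa / (1.0 - mean_kappa))
    upper = sum(math.log2(1.0 + k / (1.0 - k)) for k in kappas) / len(kappas)
    return lower, upper

# Hypothetical kappa values for a few channel realizations (illustration only).
samples = [0.90, 0.92, 0.95, 0.93, 0.91]
low, up = gmi_bounds(samples)
print(f"lower = {low:.4f} bits, upper = {up:.4f} bits")
```

When the samples of $\kappa$ concentrate around their mean, as channel hardening suggests for large $N$, the two values nearly coincide.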


B. Training and Effect of Imperfect CSI 

Our results derived thus far are based on the perfect CSI
assumption. In practice, however, CSI needs to be acquired either
explicitly or implicitly, say, via channel estimation. Channel
estimation with coarsely quantized channel outputs is both
inefficient and difficult to analyze. Therefore,
to study the robustness of the mixed-ADC architecture to
imperfect CSI, in this paper we only utilize the high-resolution
ADCs to perform channel estimation.

Specifically, we estimate the channel vector in a round-robin
manner: we link the $K$ pairs of high-resolution
ADCs to the first $K$ antennas and estimate the corresponding
channel coefficients $h_1,\ldots,h_K$ at the first symbol time, then switch
the $K$ pairs of high-resolution ADCs to the next $K$ antennas
and estimate $h_{K+1},\ldots,h_{2K}$ at the next symbol time, and so
on. Thus the training phase lasts about $N/K$ symbol times⁸.
To simplify analysis, in this subsection we assume that each
antenna follows i.i.d. Rayleigh fading, so that $h_n\sim\mathcal{CN}(0,1)$,
$n = 1,2,\ldots,N$. An MMSE estimator is adopted at the BS,
and thus without loss of generality, we can decompose $h_n$ into

$$h_n = \hat{h}_n + \tilde{h}_n,\quad n = 1,\ldots,N, \tag{35}$$

where $\hat{h}_n\sim\mathcal{CN}(0,1-\sigma_t^2)$ is the estimated channel coefficient,
while $\tilde{h}_n\sim\mathcal{CN}(0,\sigma_t^2)$ accounts for the independent estimation
error. Accordingly, we define the MSE of the channel
estimation as $\mathrm{MSE}_t = \sigma_t^2$.
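
To make the decomposition (35) concrete, the following minimal Monte Carlo sketch draws the estimate and the estimation error directly from their stated distributions. This is an assumption for illustration: no actual pilot-based MMSE estimator is simulated, and the function names and the seed are hypothetical.

```python
import math
import random

def complex_gaussian(rng, variance):
    """Draw one sample from CN(0, variance): real and imaginary parts have variance/2 each."""
    std = math.sqrt(variance / 2.0)
    return complex(rng.gauss(0.0, std), rng.gauss(0.0, std))

def draw_channel(rng, mse_t):
    """Return (h, h_hat, h_tilde) with h = h_hat + h_tilde as in (35)."""
    h_hat = complex_gaussian(rng, 1.0 - mse_t)   # estimated coefficient
    h_tilde = complex_gaussian(rng, mse_t)       # independent estimation error
    return h_hat + h_tilde, h_hat, h_tilde

rng = random.Random(0)
mse_t = 10 ** (-10 / 10)                         # MSE_t = -10 dB, i.e. 0.1
draws = [draw_channel(rng, mse_t) for _ in range(100_000)]
power_h = sum(abs(h) ** 2 for h, _, _ in draws) / len(draws)
power_err = sum(abs(e) ** 2 for _, _, e in draws) / len(draws)
print(f"E|h|^2 ~ {power_h:.3f}, E|h_tilde|^2 ~ {power_err:.3f}")
```

The empirical powers should be close to 1 and to $\mathrm{MSE}_t$, respectively, which is consistent with $h_n\sim\mathcal{CN}(0,1)$.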

In this situation, the linear combiner $\mathbf{w}$ and the ADC switch
vector $\boldsymbol{\delta}$ should be designed based on the channel estimate
$\hat{\mathbf{h}}$. Besides, we rewrite $f(x,\mathbf{h},z)$ as $f(x,\hat{\mathbf{h}},\tilde{\mathbf{h}},z)$ in order to
incorporate the effect of channel estimation. Then with some
modification, our analysis developed in the last subsection still
applies for the imperfect CSI case. To proceed, we have

$$I^{\mathrm{im}}_{\mathrm{GMI}} = \frac{T-N/K}{T}\sup_{a\in\mathbb{C},\,\theta<0}\left(\theta\,\mathbb{E}\big[|f(x,\hat{\mathbf{h}},\tilde{\mathbf{h}},z)-ax|^2\big]-\frac{\theta\,\mathbb{E}\big[|f(x,\hat{\mathbf{h}},\tilde{\mathbf{h}},z)|^2\big]}{1-\theta|a|^2\mathcal{E}_s}+\log\big(1-\theta|a|^2\mathcal{E}_s\big)\right), \tag{36}$$

which obeys an analogous form as (28), except that the
leading coefficient $(T-N/K)/T$ accounts for the rate loss due
to channel training ($T$ is the coherence interval length), and
that the expectation here is taken with respect to $x$, $\hat{\mathbf{h}}$, $\tilde{\mathbf{h}}$,
and $z$. Exploiting a similar argument as that in the proof of
Proposition 4, we arrive at the following proposition.


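Proposition 5. For block fading channels with imperfect CSI,
a lower bound of $I^{\mathrm{im}}_{\mathrm{GMI}}$ is

$$I^{\mathrm{im},l}_{\mathrm{GMI}} = \frac{T-N/K}{T}\cdot\log\left(1+\frac{\mathbb{E}_{\hat{\mathbf{h}}}[\kappa(\mathbf{w}^{\mathrm{im}}_{\mathrm{opt}},\boldsymbol{\delta})]}{1-\mathbb{E}_{\hat{\mathbf{h}}}[\kappa(\mathbf{w}^{\mathrm{im}}_{\mathrm{opt}},\boldsymbol{\delta})]}\right), \tag{37}$$

and an upper bound of $I^{\mathrm{im}}_{\mathrm{GMI}}$ is

$$I^{\mathrm{im},u}_{\mathrm{GMI}} = \frac{T-N/K}{T}\cdot\mathbb{E}_{\hat{\mathbf{h}}}\left[\log\left(1+\frac{\kappa(\mathbf{w}^{\mathrm{im}}_{\mathrm{opt}},\boldsymbol{\delta})}{1-\kappa(\mathbf{w}^{\mathrm{im}}_{\mathrm{opt}},\boldsymbol{\delta})}\right)\right]. \tag{38}$$

Here, $\mathbf{w}^{\mathrm{im}}_{\mathrm{opt}}$ and $\kappa(\mathbf{w}^{\mathrm{im}}_{\mathrm{opt}},\boldsymbol{\delta})$ also come from (16) and (17), but
we need to replace $\mathbf{R}_{rx}$ with $\mathbf{R}^{\mathrm{im}}_{rx} = \mathbb{E}_{\tilde{\mathbf{h}}}[\mathbf{R}_{rx}]$, and replace
$\mathbf{R}_{rr}$ with $\mathbf{R}^{\mathrm{im}}_{rr} = \mathbb{E}_{\tilde{\mathbf{h}}}[\mathbf{R}_{rr}]$.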
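
The training overhead that scales both bounds can be illustrated with a short calculation. The sketch below is only an illustration: the helper names are hypothetical, and rounding $N/K$ up to an integer number of symbol times is an assumption for the case where $K$ does not divide $N$ (the text only states that training lasts about $N/K$ symbol times).

```python
import math

def training_symbols(n_antennas, k_pairs):
    """Symbol times used by round-robin training (rounded up; an assumption when K does not divide N)."""
    if k_pairs <= 0:
        raise ValueError("at least one pair of high-resolution ADCs is needed for training")
    return math.ceil(n_antennas / k_pairs)

def prelog_factor(n_antennas, k_pairs, coherence_len):
    """Fraction of the coherence interval left for data, (T - N/K) / T."""
    return (coherence_len - training_symbols(n_antennas, k_pairs)) / coherence_len

# Setting used in the paper's simulations: N = 100, K = 20, T = 196.
print(training_symbols(100, 20))              # 5
print(round(prelog_factor(100, 20, 196), 4))  # 0.9745
```

With these parameters only about 2.6% of each coherence interval is spent on training, while a very small $K$ (for example $K = 1$) would consume more than half of it.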


⁸For example, a BS equipped with 100 antennas and 20 pairs of high-resolution
ADCs would consume 5 symbol times in each coherence interval
for channel estimation. This overhead is acceptable for slowly or moderately
varying fading channels; for example, in [3] the channel coherence interval
length is taken as 196, which is also used by us in the subsequent simulations.
The efficiency and quality of channel training may be improved by jointly
exploiting high-resolution ADCs and one-bit ADCs, which is an interesting
and important topic for future research.


V. Extension to Multi-User Scenario

In this section, we consider a multi-user system where the
BS serves $M$ single-antenna users simultaneously. The CSI is
assumed perfectly known by the BS, and there are still only
$K$ pairs of high-resolution ADCs available.


A. Fixed Channels

Again, we start from the fixed channel case. The channel
matrix between the users and the BS is denoted by $\mathbf{H} =
[\mathbf{h}_1,\ldots,\mathbf{h}_N]\in\mathbb{C}^{M\times N}$, i.e., $\mathbf{h}_n = [h_{1n},\ldots,h_{Mn}]^{T}$ collecting
the channel coefficients related to the $n$-th antenna at the BS.
We write the quantized output at the $n$-th antenna, with user
$j$ considered, as

$$r^{\mathrm{mu}}_n = \delta_n\cdot\Big(h_{jn}x_j+\sum_{i=1,\,i\neq j}^{M}h_{in}x_i+z_n\Big)+\bar{\delta}_n\cdot\mathrm{sgn}\Big(h_{jn}x_j+\sum_{i=1,\,i\neq j}^{M}h_{in}x_i+z_n\Big), \tag{39}$$

where $x_i\sim\mathcal{CN}(0,\mathcal{E}_s)$ denotes the i.i.d. coded signal dedicated
to the $i$-th user, and $\sum_{i\neq j}h_{in}x_i+z_n$ summarizes the co-channel
interference and noise for the considered user $j$. For a
fair comparison, the SNR in this situation is defined as $\mathrm{SNR} =
M\mathcal{E}_s$, reflecting the total transmit power from all the users.

Following a similar derivation procedure as that in Section
III, we get the GMI of the considered user. The proof is
omitted for concision.


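Proposition 6. For given $\mathbf{H}$ and $\boldsymbol{\delta}$, when treating other users'
signals as noise, the GMI of user $j$ is

$$I^{\mathrm{mu}}_{\mathrm{GMI}} = \log\left(1+\frac{\kappa^{\mathrm{mu}}}{1-\kappa^{\mathrm{mu}}}\right), \tag{40}$$

where the parameter $\kappa^{\mathrm{mu}}$ is

$$\kappa^{\mathrm{mu}} = \frac{(\mathbf{R}^{\mathrm{mu}}_{rx})^{\dagger}(\mathbf{R}^{\mathrm{mu}}_{rr})^{-1}\mathbf{R}^{\mathrm{mu}}_{rx}}{\mathcal{E}_s}. \tag{41}$$

$\mathbf{R}^{\mathrm{mu}}_{rx}$ is the correlation vector between $\mathbf{r}^{\mathrm{mu}}$ and $x_j$, with its
$n$-th entry given as

$$(\mathbf{R}^{\mathrm{mu}}_{rx})_n = h_{jn}\mathcal{E}_s\left[\delta_n+\bar{\delta}_n\cdot\sqrt{\frac{4}{\pi(\|\mathbf{h}_n\|^2\mathcal{E}_s+1)}}\right], \tag{42}$$

and $\mathbf{R}^{\mathrm{mu}}_{rr}$ is the covariance matrix of $\mathbf{r}^{\mathrm{mu}}$, with the $(n,m)$-th
entry being

$$(\mathbf{R}^{\mathrm{mu}}_{rr})_{n,m} = \begin{cases}
1+\delta_n\cdot\|\mathbf{h}_n\|^2\mathcal{E}_s+\bar{\delta}_n, & \text{if } n = m,\\[4pt]
\mathbf{h}_n^{T}\mathbf{h}_m^{*}\mathcal{E}_s\left[\delta_n\delta_m+\delta_n\bar{\delta}_m\sqrt{\dfrac{4}{\pi(\|\mathbf{h}_m\|^2\mathcal{E}_s+1)}}+\bar{\delta}_n\delta_m\sqrt{\dfrac{4}{\pi(\|\mathbf{h}_n\|^2\mathcal{E}_s+1)}}\right]\\[6pt]
\quad+\ \bar{\delta}_n\bar{\delta}_m\cdot\dfrac{4}{\pi}\left[\arcsin\left(\dfrac{(\mathbf{h}_n^{T}\mathbf{h}_m^{*})^{R}\mathcal{E}_s}{\sqrt{\|\mathbf{h}_n\|^2\mathcal{E}_s+1}\sqrt{\|\mathbf{h}_m\|^2\mathcal{E}_s+1}}\right)+i\cdot\arcsin\left(\dfrac{(\mathbf{h}_n^{T}\mathbf{h}_m^{*})^{I}\mathcal{E}_s}{\sqrt{\|\mathbf{h}_n\|^2\mathcal{E}_s+1}\sqrt{\|\mathbf{h}_m\|^2\mathcal{E}_s+1}}\right)\right], & \text{if } n\neq m.
\end{cases} \tag{43}$$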
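
As a numerical companion to Proposition 6, the sketch below evaluates the per-user GMI from a channel matrix and an ADC switch vector. It is a minimal illustration under stated assumptions: NumPy is used, the function name `multiuser_gmi` and its argument layout are hypothetical, the noise power is normalized to one, the complex sign is taken as sgn(Re) + i sgn(Im), and the rate is returned in bits (base-2 logarithm). The correlation vector and covariance matrix follow the structure of (42) and (43): linear terms for high-resolution antennas, Bussgang-type scaling for mixed pairs, and the arcsine law for pairs of one-bit antennas.

```python
import numpy as np

def multiuser_gmi(H, delta, Es, j):
    """GMI in bits of user j, treating other users' signals as noise.

    H     : (M, N) complex array, H[i, n] = h_{in} (user i, antenna n)
    delta : length-N array of 0/1; 1 marks an antenna with high-resolution ADCs
    Es    : per-user signal power (noise power normalized to 1)
    """
    H = np.asarray(H, dtype=complex)
    d = np.asarray(delta, dtype=float)
    db = 1.0 - d                                   # delta-bar: one-bit antennas
    G = H.T @ H.conj()                             # G[n, m] = h_n^T h_m^*
    var = np.real(np.diag(G)) * Es + 1.0           # ||h_n||^2 Es + 1
    s = np.sqrt(4.0 / (np.pi * var))
    a = d + db * s

    r_rx = H[j, :] * Es * a                        # correlation vector

    norm = np.sqrt(np.outer(var, var))
    rho_re = np.clip(np.real(G) * Es / norm, -1.0, 1.0)
    rho_im = np.clip(np.imag(G) * Es / norm, -1.0, 1.0)
    both_one_bit = np.outer(db, db)
    # Coefficient of h_n^T h_m^* Es: every pair except (one-bit, one-bit).
    linear_part = np.outer(a, a) - both_one_bit * np.outer(s, s)
    R = Es * G * linear_part + both_one_bit * (4.0 / np.pi) * (
        np.arcsin(rho_re) + 1j * np.arcsin(rho_im))
    np.fill_diagonal(R, 1.0 + d * np.real(np.diag(G)) * Es + db)

    kappa = np.real(np.vdot(r_rx, np.linalg.solve(R, r_rx))) / Es
    return float(np.log2(1.0 + kappa / (1.0 - kappa)))

# Illustrative setting (values are assumptions, not the paper's simulation setup).
rng = np.random.default_rng(0)
M, N, K = 4, 32, 8
H = (rng.standard_normal((M, N)) + 1j * rng.standard_normal((M, N))) / np.sqrt(2)
delta = np.zeros(N)
delta[:K] = 1
print(multiuser_gmi(H, delta, Es=0.25, j=0))
```

A useful sanity check is the all-high-resolution case: with every entry of the switch vector equal to one, the effective SNR $\kappa^{\mathrm{mu}}/(1-\kappa^{\mathrm{mu}})$ reduces to the output SINR of the linear MMSE receiver.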


In the multi-user scenario, there is no clear clue about how
to switch the high-resolution ADCs. To obtain some hint,
we explore the asymptotic behavior of (40) in the low SNR
regime, leading to the corollary below.

Corollary 5. When $\mathcal{E}_s\to 0$, for given $\mathbf{H}$ and $\boldsymbol{\delta}$, we have the
GMI of user $j$ as

$$I^{\mathrm{mu}}_{\mathrm{GMI}} = \sum_{n=1}^{N}\left(\delta_n+\bar{\delta}_n\cdot\frac{2}{\pi}\right)\cdot|h_{jn}|^2\mathcal{E}_s+o(\mathcal{E}_s). \tag{44}$$

The proof procedure is virtually the same as Appendix-B
and thus is omitted. We notice that $I^{\mathrm{mu}}_{\mathrm{GMI}}$ behaves analogously
with $I_{\mathrm{GMI}}$ in the low SNR regime, which is foreseeable as
the system is now noise-limited. The sum GMI now equals
$\sum_{n=1}^{N}\left(\delta_n+\bar{\delta}_n\cdot\frac{2}{\pi}\right)\cdot\|\mathbf{h}_n\|^2\mathcal{E}_s+o(\mathcal{E}_s)$, which suggests
that the $K$ pairs of high-resolution ADCs may be switched to
the antennas with the $K$ largest $\|\mathbf{h}_n\|$.

The asymptotic behavior of $I^{\mathrm{mu}}_{\mathrm{GMI}}$ in the high SNR regime
is analytically intractable, and thus there is no generally
convincing ADC switch scheme for the multi-user scenario.
For this reason, we consider two heuristic switch schemes in
the numerical study.

• Random switch: high-resolution ADCs are switched randomly.

• Norm-based switch: as suggested by Corollary 5, high-resolution
ADCs are switched to antennas with the $K$
largest $\|\mathbf{h}_n\|$.

Numerical results will be given in Section VII to examine the
performance of both switch schemes.
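
The two heuristic switch schemes and the low-SNR coefficient suggested by Corollary 5 can be sketched as follows. The function names, the seed, and the toy channel generator are illustrative assumptions; only the weighting (1 for antennas with high-resolution ADCs, 2/π for one-bit antennas) comes from the low-SNR expression above.

```python
import math
import random

def norm_based_switch(col_norms_sq, k_pairs):
    """Return delta with ones at the k_pairs antennas having the largest ||h_n||^2."""
    order = sorted(range(len(col_norms_sq)), key=lambda n: col_norms_sq[n], reverse=True)
    delta = [0] * len(col_norms_sq)
    for n in order[:k_pairs]:
        delta[n] = 1
    return delta

def random_switch(n_antennas, k_pairs, rng):
    """Return delta with ones at k_pairs antennas chosen uniformly at random."""
    chosen = set(rng.sample(range(n_antennas), k_pairs))
    return [1 if n in chosen else 0 for n in range(n_antennas)]

def low_snr_sum_slope(col_norms_sq, delta):
    """Coefficient of Es in the low-SNR sum GMI: sum_n (delta_n + (1 - delta_n) * 2/pi) * ||h_n||^2."""
    return sum((d + (1 - d) * 2.0 / math.pi) * g for d, g in zip(delta, col_norms_sq))

# Toy example: N = 100 antennas, M = 10 users, i.i.d. CN(0, 1) coefficients.
rng = random.Random(0)
N, M, K = 100, 10, 20
norms_sq = [sum(rng.gauss(0, math.sqrt(0.5)) ** 2 + rng.gauss(0, math.sqrt(0.5)) ** 2
                for _ in range(M)) for _ in range(N)]
slope_norm = low_snr_sum_slope(norms_sq, norm_based_switch(norms_sq, K))
slope_rand = low_snr_sum_slope(norms_sq, random_switch(N, K, rng))
print(f"norm-based: {slope_norm:.2f}, random: {slope_rand:.2f}")
```

By construction the norm-based choice maximizes this low-SNR coefficient among all switch vectors with $K$ ones; nothing is implied here about its optimality at higher SNR.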


B. Ergodic Fading Channels 

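The analysis is then naturally applied to ergodic fading
channels, as summarized by the subsequent proposition. Numerical
study will also be conducted in Section VII to verify
the tightness of the lower and upper bounds.


Proposition 7. For ergodic fading channels, lower and upper
bounds of the GMI for user $j$ are

$$I^{\mathrm{mu},l}_{\mathrm{GMI}} = \log\left(1+\frac{\mathbb{E}_{\mathbf{H}}[\kappa^{\mathrm{mu}}]}{1-\mathbb{E}_{\mathbf{H}}[\kappa^{\mathrm{mu}}]}\right), \tag{45}$$

$$I^{\mathrm{mu},u}_{\mathrm{GMI}} = \mathbb{E}_{\mathbf{H}}\left[\log\left(1+\frac{\kappa^{\mathrm{mu}}}{1-\kappa^{\mathrm{mu}}}\right)\right], \tag{46}$$

where the parameter $\kappa^{\mathrm{mu}}$ is given by (41).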

VI. Energy Efficiency 

We establish the power models for conventional architecture
(CA), antenna selection (AS), and mixed-ADC architecture
(MA). Only the circuit power consumption is taken into
account, since first, we focus on the receiver design, and
second, the power expenditure on digital signal processing is
approximately independent of the choice of receivers, all of
which are based on linear combining. Then power models of
the three considered receivers are

$$\begin{aligned}
P_{\mathrm{CA}} &= N(P_{\mathrm{LNA}}+P_{\mathrm{mix}}+P_{\mathrm{ADC}}+P_{\mathrm{fil}})+P_{\mathrm{syn}},\\
P_{\mathrm{AS}} &= K(P_{\mathrm{LNA}}+P_{\mathrm{mix}}+P_{\mathrm{ADC}}+P_{\mathrm{fil}})+P_{\mathrm{syn}},\\
P_{\mathrm{MA}} &= N(P_{\mathrm{LNA}}+P_{\mathrm{mix}}+P_{\mathrm{fil}})+KP_{\mathrm{ADC}}+P_{\mathrm{syn}},
\end{aligned} \tag{47}$$

where $P_{\mathrm{LNA}}$, $P_{\mathrm{mix}}$, $P_{\mathrm{ADC}}$, $P_{\mathrm{fil}}$, and $P_{\mathrm{syn}}$ account for the
power consumption of low noise amplifier (LNA), mixer,
a pair of high-resolution ADCs, filters, and frequency synthesizer
(which is typically shared among all the antennas
in practice), respectively. Power consumption due to one-bit
ADCs is neglected, since they can be implemented as
polarity detectors using discrete components and thus the
power consumption is marginal compared with other parts of
the circuitry.

We refer to a widely used model [24] to determine the power
consumption parameters. Bandwidth in [24] is taken to be 1
MHz at a carrier frequency of $f_c$ = 2 GHz, while in this paper
we assume a bandwidth of $B$ = 40 MHz⁹ at the same carrier
frequency. To account for this scaling, realizing that the power
consumption of RF front-end except ADC is insensitive to the
bandwidth¹⁰ but the power consumption of an ADC scales
linearly with the bandwidth, we update the power consumption
parameters as: $P_{\mathrm{LNA}}$ = 20 mW, $P_{\mathrm{mix}}$ = 21 mW, $P_{\mathrm{syn}}$ =
67.5 mW, $P_{\mathrm{fil}}$ = 5 mW, and $P_{\mathrm{ADC}}$ = 234 mW. As a side
note, for many high-speed applications, high-resolution ADCs
generally account for a dominant portion of the circuit power
consumption; in some recent works (e.g., [27]), only the
ADC power consumption is taken into account, ignoring the
other RF front-end parts.

⁹Note that LTE-Advanced supports 15-100 MHz bands in TDD uplink [25].
Besides, a bandwidth of 40 MHz would be necessary for supporting an average
per-user rate of 100 Mbps for future 5G.

¹⁰See [26] for example, where the signal bandwidth ranges from 0.5 MHz to
50 MHz, but the RF front-end except ADC power consumption only changes
from 20 mW to 40 mW, and the change is mainly due to the fluctuation of
receiver gain and noise figure.

Fig. 2. Outage-GMI of the mixed-ADC architecture for different numbers
of high-resolution ADC pairs, N = 100, Pout = 5%. (Horizontal axis: number of high-resolution ADC pairs, K.)

Fig. 3. Lower and upper bounds of the GMI for ergodic fading channels
with perfect CSI at the BS, N = 100. (Horizontal axis: number of high-resolution ADC pairs, K.)

Fig. 4. GMI of the mixed-ADC architecture with imperfect CSI: impact of
K, N = 100, T = 196, MSEt = -10 dB. (Horizontal axis: number of high-resolution ADC pairs, K. Curves: GMI lower bound with perfect CSI; GMI lower bound with imperfect CSI; GMI upper bound with imperfect CSI; capacity of CA with perfect CSI.)

Fig. 5. GMI of the mixed-ADC architecture with imperfect CSI: impact of
MSEt, N = 100, T = 196, K = 20. (Horizontal axis: MSEt [dB].)
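
The three power models can be evaluated directly with the parameter values listed above. The sketch below is only an illustration: the function names are hypothetical, all powers are in mW, and the choice N = 100 with K = 20 is one example operating point.

```python
# Power parameters in mW, as listed above for a bandwidth of 40 MHz.
P_LNA, P_MIX, P_SYN, P_FIL, P_ADC = 20.0, 21.0, 67.5, 5.0, 234.0

def power_ca(n_antennas):
    """Conventional architecture: a full RF chain with high-resolution ADCs per antenna."""
    return n_antennas * (P_LNA + P_MIX + P_ADC + P_FIL) + P_SYN

def power_as(k_selected):
    """Antenna selection: only the K selected antennas have active RF chains."""
    return k_selected * (P_LNA + P_MIX + P_ADC + P_FIL) + P_SYN

def power_ma(n_antennas, k_pairs):
    """Mixed-ADC architecture: N front-ends, K pairs of high-resolution ADCs, one-bit ADCs neglected."""
    return n_antennas * (P_LNA + P_MIX + P_FIL) + k_pairs * P_ADC + P_SYN

N, K = 100, 20
print(power_ca(N))        # 28067.5
print(power_as(K))        # 5667.5
print(power_ma(N, K))     # 9347.5
print(f"normalized energy consumption of MA: {power_ma(N, K) / power_ca(N):.3f}")
```

Under this model the mixed-ADC receiver with K = 20 draws roughly one third of the circuit power of the conventional receiver, and somewhat more than antenna selection with the same K because all N front-ends stay active.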

Energy efficiency is sometimes defined as the number of
information bits conveyed per joule energy consumption. But
this ratio alone does not capture the whole story, since the
improvement of energy efficiency is valuable only if a desired
spectral efficiency is ensured. For this reason, in this paper
we characterize the energy efficiency using two performance
metrics: normalized spectral efficiency and normalized energy
consumption. Taking the mixed-ADC architecture as an example,
these two performance metrics are defined as

$$R_{\mathrm{MA}} = \frac{I_{\mathrm{GMI}}}{\mathbb{E}_{\mathbf{h}}\big[\log(1+\|\mathbf{h}\|^2\mathcal{E}_s)\big]},\qquad E_{\mathrm{MA}} = \frac{P_{\mathrm{MA}}}{P_{\mathrm{CA}}}, \tag{48}$$

in the single-user scenario under ergodic fading. That is, we
simultaneously compare the spectral efficiency and the energy
efficiency of the mixed-ADC architecture against those of
the conventional architecture. These performance metrics can
also be straightforwardly defined for antenna selection and for
multi-user systems (there the sum achievable rates are used in
$R_{\mathrm{MA}}$).


VII. Numerical Results 

In this section we validate our previous analysis with
numerical results. Except for the first subsection, all the results
in this section are for ergodic fading channels. The channel
coefficients are drawn i.i.d. from $\mathcal{CN}(0,1)$. We regard the SNR that
achieves 5 bits/s/Hz for the single-user scenario, or 2.5 bits/s/Hz
per user for the multi-user scenario, as a moderate SNR [25].


A. Outage-GMI for Random but Fixed SIMO Channel 

We first examine the outage performance of the mixed-ADC
architecture. In this situation, the channel vector is random
but remains fixed once it is chosen. Figure 2 displays the
outage-GMI¹¹ for Pout = 5%. Several observations are in order. First,
Figure 2 shows that the mixed-ADC architecture with a small
number of high-resolution ADCs achieves a large fraction of
the outage-capacity of the conventional architecture. For example,
when SNR = 0 dB, the mixed-ADC architecture with
K = 10 attains 85% of the outage-capacity of the conventional
architecture, and this number rises to 92% when K = 20.
Besides, it indicates that one-bit ADCs are less beneficial
when the SNR grows large, but significantly improve the
performance in the low to moderate SNR regime, compared
with antenna selection.

Fig. 6. GMI lower bound of the mixed-ADC architecture for ergodic fading
channels, N = 100.

Fig. 7. GMI lower bound of the mixed-ADC architecture for ergodic fading
channels with Gaussian dithering, N = 100.

B. GMI for Ergodic Fading SIMO Channel 

Using Figure 3, we first examine the tightness of the lower and
upper bounds derived in Proposition 4. It is clear that the lower
and upper bounds virtually coincide with each other, and as a
result, it is sufficient to use only the GMI lower bound in the
following numerical study for spectral efficiency evaluation.

¹¹The outage-GMI is defined as the largest GMI at a specified outage
probability Pout. In this subsection, both the outage-GMI and the outage-
capacity are obtained by running 1000 Monte Carlo simulations.
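
One way to read this definition empirically is sketched below: given GMI values from independent channel realizations, the outage-GMI is taken as the largest rate R such that the fraction of realizations with GMI strictly below R does not exceed Pout. The helper name and this particular empirical-quantile convention are assumptions for illustration; the paper does not spell out the estimator it uses.

```python
import math

def outage_gmi(gmi_samples, p_out):
    """Largest rate R whose empirical outage fraction (samples strictly below R) is at most p_out."""
    if not gmi_samples or not (0.0 <= p_out < 1.0):
        raise ValueError("need at least one sample and 0 <= p_out < 1")
    ordered = sorted(gmi_samples)
    # floor(p_out * n) samples may lie strictly below R, so R is the next order statistic.
    idx = int(math.floor(p_out * len(ordered) + 1e-12))
    return ordered[min(idx, len(ordered) - 1)]

# Toy check with 100 synthetic "GMI" values 1, 2, ..., 100 and Pout = 5%.
print(outage_gmi(list(range(1, 101)), 0.05))  # 6
```

With 1000 Monte Carlo realizations and Pout = 5%, this rule returns the 51st smallest GMI value.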



Fig. 8. Per-user GMI under ergodic fading: comparison of random and
norm-based ADC switch schemes, N = 100, M = 10. (Horizontal axis: number of high-resolution ADC pairs, K.)

Fig. 9. Per-user GMI under ergodic fading: comparison with conventional
architecture and antenna selection, N = 100, M = 10.

Then, we turn to check the impact of imperfect CSI on
the performance. Numerical results are given by Figure 4
assuming MSEt = -10 dB, indicating that the gap between
lower and upper bounds is still virtually negligible. On the
other hand, though there is a noticeable rate loss due to channel
estimation error, the mixed-ADC architecture with a small
number of high-resolution ADCs still achieves much of the
channel capacity of the conventional architecture with perfect
CSI. Besides, Figure 5 accounts for the impact of MSEt on the
performance, from which we again conclude that the mixed-ADC
architecture is robust against imperfect CSI.

C. Performance Gain of Gaussian Dithering 

Figure 6 accounts for the effect of SNR on the GMI
lower bound of ergodic fading channels, with special focus
on small K. For the special case of K = 0, we observe
that the GMI lower bound $I^{l}_{\mathrm{GMI}}$ increases first but then turns downward as the
SNR grows large. Besides, as predicted by Corollary 4, $I^{l}_{\mathrm{GMI}}$
asymptotically approaches a positive limit illustrated by the
dashed line. The reason underlying this phenomenon is that
the amplitude of the transmit signal cannot be recovered at the
receiver when the SNR is sufficiently large with only one-bit
ADCs [21]. With merely one pair of high-resolution ADCs,
$I^{l}_{\mathrm{GMI}}$ is always increasing with SNR, and increases linearly
with respect to $10\log_{10}(\mathrm{SNR})$ in the high SNR regime as
predicted by Corollary 4. In addition, even though the rate loss
due to pure one-bit quantization is significant in the high SNR
regime, the GMI in the low SNR regime closely approaches
those of K > 0, as predicted by Corollary 3.

Fig. 10. Energy efficiency comparison in the single-user scenario, N = 100. (Surviving panel title: (c) SNR = 0 dB.)

Fig. 11. Energy efficiency comparison in the multi-user scenario, N = 100, M = 10. (Surviving panel title: (c) SNR = 5 dB.)

Then we examine the performance gain of Gaussian dithering.
For given N and SNR, we optimize the threshold T
assuming K = 0, and then take the resulting Topt to evaluate
the performance gain with K > 0. Figure 7 indicates that
Gaussian dithering is able to achieve promising improvement
in the spectral efficiency, especially for the case of K = 0.
As either K or SNR increases, however, the benefit of dithering
for K > 0 decays gradually, since the contribution of
high-resolution ADCs tends to dominate.

D. GMI for Ergodic Fading MU-MIMO Channel 

Now, we examine the feasibility of the mixed-ADC architecture
in the multi-user scenario. The performance comparison
between random and norm-based ADC switch schemes is
given by Figure 8. We notice that though the norm-based
ADC switch is only analytically justified in the low SNR regime,
it does achieve better performance. Moreover, the lower and
upper bounds of the GMI for each scheme still virtually
coincide with each other.

Figure 9 compares the achievable spectral efficiency of the
mixed-ADC architecture with that of conventional architecture
and antenna selection (using linear MMSE receiver for a
fair comparison). Similar to the conclusion we obtained for
the single-user scenario, here the mixed-ADC architecture
with a small number of high-resolution ADCs also attains
a large fraction of the rate of conventional architecture. As
numerical evidence, when SNR = 0 dB and N = 100,
norm-based ADC switch with K = 10 achieves 77% of the
per-user rate of conventional architecture, and this number
rises to 81% when we have K = 20. Meanwhile, the mixed-ADC
architecture also achieves a noticeably higher spectral
efficiency than antenna selection.

E. Energy Efficiency 

We evaluate the energy efficiency improvement of the
mixed-ADC architecture as well as antenna selection, taking
conventional architecture as a baseline. We emphasize that
spectral efficiency should never be excessively sacrificed for
energy efficiency, and thus confine the normalized spectral
efficiency to 80% - 100%.

Figure 10 illustrates the numerical results for a single-user
system. We notice that, if 10% spectral efficiency degradation
is allowed, then antenna selection can achieve more than 60%
energy consumption reduction, and beyond that, the mixed-ADC
architecture can further reduce the energy consumption
by about 10%, in the low to moderate SNR regime. Besides, it is
perhaps worth noting that, in the high SNR regime, antenna
selection may achieve higher energy efficiency than the mixed-ADC
architecture, since now one-bit ADCs are getting less
beneficial as demonstrated by Corollary 4.

Regarding the multi-user scenario, Figure 11 reveals more
pronounced superiority of the mixed-ADC architecture over
antenna selection. The mixed-ADC architecture always outperforms
antenna selection throughout the considered SNR range,
and we note that the gap will further increase as the system
load (i.e., the number of users M) increases. For the system
parameters in Figure 11, it appears that spectral efficiency and
energy efficiency arrive at an attractive tradeoff at K ≈ 20,
where we accept a 20% loss in spectral efficiency in exchange
for a 70% reduction in energy consumption.

VIII. Conclusion

The numerous BS antennas enable massive MIMO systems
to achieve unprecedented gains in both spectral efficiency and
radiated energy efficiency, but also make the hardware cost
and circuit power consumption increase unbearably, demanding
energy-efficient design of transceivers. In this paper, we
propose a mixed-ADC receiver architecture for the uplink, and
leverage GMI to analytically evaluate its achievable data rates
under various scenarios. Numerical results demonstrate that
the mixed-ADC architecture with a relatively small number of
high-resolution ADCs is able to achieve a large fraction of the
channel capacity of conventional architecture, while reducing the
energy consumption considerably even compared with antenna
selection, for both single-user and multi-user scenarios. We
envision the mixed-ADC architecture as a compelling choice
for energy-efficient massive MIMO systems.

A number of interesting and important problems remain
unsolved beyond this paper, such as designing the optimal
ADC switch scheme for any SNR, especially for the multi-user
scenario; making full use of the available one-bit ADCs
when acquiring the CSI; extending the analysis to hardware
impairment models besides ADC; among others. Additionally,
in order to make this approach effective for wideband channels,
which will be more prevalent in future communication
systems, it is particularly crucial to extend the analysis to
frequency-selective fading channels. When one adopts multi-carrier
transceiver architectures like OFDM, since one-bit
ADCs are applied in the time domain rather than the frequency
domain, severe inter-carrier interference due to quantization
is inevitable and thus the decoder needs to properly account
for this, say, by using a vectorized nearest-neighbor decoding
algorithm and evaluating the resulting GMI. This is feasible
but beyond the scope of this paper, and is currently treated in
a separate work.


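Appendix

A. Derivation of $\kappa(\mathbf{w},\boldsymbol{\delta})$

We first introduce two lemmas that will help us derive a
closed-form expression of $\kappa(\mathbf{w},\boldsymbol{\delta})$.

Lemma 1. For zero-mean real Gaussian random variables $S$
and $T$ with covariance matrix $\mathbf{K}$, letting $\phi(s,t)$ denote their
joint probability density function (PDF) and $\rho$ represent their
correlation coefficient, we have

$$\mathbb{E}[\mathrm{sgn}(S)\cdot\mathrm{sgn}(T)] = \frac{2}{\pi}\arcsin(\rho). \tag{49}$$

Proof: Applying [28, Prop. 2], we obtain the following
relationship,

$$\int_0^{\infty}\!\!\int_0^{\infty}\phi(s,t)\,dt\,ds = \frac{1}{4}+\frac{1}{2\pi}\arcsin(\rho). \tag{50}$$

Then exploiting the symmetry of $\phi(s,t)$, it is straightforward
to verify that

$$\begin{aligned}
\mathbb{E}[\mathrm{sgn}(S)\cdot\mathrm{sgn}(T)]
&= \iint_{st>0}\phi(s,t)\,dt\,ds-\iint_{st<0}\phi(s,t)\,dt\,ds\\
&= 2\iint_{st>0}\phi(s,t)\,dt\,ds-1\\
&= 4\int_0^{\infty}\!\!\int_0^{\infty}\phi(s,t)\,dt\,ds-1\\
&= \frac{2}{\pi}\arcsin(\rho).
\end{aligned} \tag{51}$$

Lemma 2. For independent complex Gaussian random variables
$S\sim\mathcal{CN}(0,\sigma_s^2)$ and $T\sim\mathcal{CN}(0,\sigma_t^2)$, we have

$$\mathbb{E}[S^{*}\cdot\mathrm{sgn}(S+T)] = \mathbb{E}[S\cdot\mathrm{sgn}^{*}(S+T)] = \sigma_s^2\sqrt{\frac{4}{\pi(\sigma_s^2+\sigma_t^2)}}. \tag{52}$$

Proof: With some manipulation, we have

$$\begin{aligned}
\mathbb{E}[S^{*}\cdot\mathrm{sgn}(S+T)]
&= \mathbb{E}[S^{R}\cdot\mathrm{sgn}(S^{R}+T^{R})]+\mathbb{E}[S^{I}\cdot\mathrm{sgn}(S^{I}+T^{I})]\\
&\quad+i\cdot\mathbb{E}[S^{R}\cdot\mathrm{sgn}(S^{I}+T^{I})]-i\cdot\mathbb{E}[S^{I}\cdot\mathrm{sgn}(S^{R}+T^{R})]\\
&\overset{(a)}{=} \frac{\sigma_s^2}{2}\sqrt{\frac{2}{\pi(\sigma_s^2/2+\sigma_t^2/2)}}+\frac{\sigma_s^2}{2}\sqrt{\frac{2}{\pi(\sigma_s^2/2+\sigma_t^2/2)}}\\
&= \sigma_s^2\sqrt{\frac{4}{\pi(\sigma_s^2+\sigma_t^2)}},
\end{aligned} \tag{53}$$

where (a) follows from [18, Eq. (19)], the independence
between $S^{R}$ and $S^{I}+T^{I}$, as well as between $S^{I}$ and $S^{R}+T^{R}$.

Now we are ready to evaluate $|\mathbb{E}[f^{*}(x,\mathbf{h},z)\cdot x]|^2$ and
$\mathbb{E}[|f(x,\mathbf{h},z)|^2]$. For given $\mathbf{w}$ and $\boldsymbol{\delta}$, we have

$$|\mathbb{E}[f^{*}(x,\mathbf{h},z)\cdot x]|^2 = \mathbf{w}^{\dagger}\mathbf{R}_{rx}\mathbf{R}_{rx}^{\dagger}\mathbf{w}, \tag{54}$$

where $\mathbf{R}_{rx} = \mathbb{E}[\mathbf{r}x^{*}]$ is the correlation vector between $\mathbf{r}$ and
$x$, whose $n$-th element is

$$\begin{aligned}
(\mathbf{R}_{rx})_n
&= \delta_n\cdot\mathbb{E}[x^{*}(h_nx+z_n)]+\bar{\delta}_n\cdot\mathbb{E}[x^{*}\,\mathrm{sgn}(h_nx+z_n)]\\
&\overset{(a)}{=} \delta_n\cdot h_n\mathcal{E}_s+\bar{\delta}_n\cdot h_n\mathcal{E}_s\sqrt{\frac{4}{\pi(|h_n|^2\mathcal{E}_s+1)}}\\
&= h_n\mathcal{E}_s\left[\delta_n+\bar{\delta}_n\cdot\sqrt{\frac{4}{\pi(|h_n|^2\mathcal{E}_s+1)}}\right].
\end{aligned} \tag{55}$$

Here, (a) follows directly from Lemma 2.

On the other hand, it is straightforward that

$$\mathbb{E}[|f(x,\mathbf{h},z)|^2] = \mathbb{E}[\mathbf{w}^{\dagger}\mathbf{r}\mathbf{r}^{\dagger}\mathbf{w}] = \mathbf{w}^{\dagger}\mathbf{R}_{rr}\mathbf{w}, \tag{56}$$

where $\mathbf{R}_{rr} = \mathbb{E}[\mathbf{r}\mathbf{r}^{\dagger}]$ is the covariance matrix of $\mathbf{r}$. The
diagonal elements of $\mathbf{R}_{rr}$ are given by

$$\begin{aligned}
(\mathbf{R}_{rr})_{n,n}
&= \mathbb{E}\big[|\delta_n\cdot(h_nx+z_n)+\bar{\delta}_n\cdot\mathrm{sgn}(h_nx+z_n)|^2\big]\\
&= \delta_n\cdot\mathbb{E}[|h_nx+z_n|^2]+\bar{\delta}_n\cdot\mathbb{E}[|\mathrm{sgn}(h_nx+z_n)|^2]\\
&= \delta_n\cdot(|h_n|^2\mathcal{E}_s+1)+\bar{\delta}_n\cdot 2\\
&= 1+\delta_n\cdot|h_n|^2\mathcal{E}_s+\bar{\delta}_n,
\end{aligned} \tag{57}$$

while the nondiagonal elements can be obtained by applying
both Lemma 1 and Lemma 2, as follows. First, applying
Lemma 2 we have

$$\begin{aligned}
\mathbb{E}[y_n\cdot\mathrm{sgn}^{*}(y_m)]
&= \mathbb{E}[(h_nx+z_n)\cdot\mathrm{sgn}^{*}(h_mx+z_m)]\\
&= \mathbb{E}\big[\mathbb{E}[(h_nx+z_n)\cdot\mathrm{sgn}^{*}(h_mx+z_m)\,|\,x,z_m]\big]\\
&= \mathbb{E}[h_nx\cdot\mathrm{sgn}^{*}(h_mx+z_m)]\\
&= h_nh_m^{*}\mathcal{E}_s\sqrt{\frac{4}{\pi(|h_m|^2\mathcal{E}_s+1)}},
\end{aligned} \tag{58}$$

and analogously

$$\mathbb{E}[\mathrm{sgn}(y_n)\cdot y_m^{*}] = h_nh_m^{*}\mathcal{E}_s\sqrt{\frac{4}{\pi(|h_n|^2\mathcal{E}_s+1)}}. \tag{59}$$

Then, we turn to evaluate $\mathbb{E}[\mathrm{sgn}(y_n)\cdot\mathrm{sgn}^{*}(y_m)]$; that is

$$\begin{aligned}
\mathbb{E}[\mathrm{sgn}(y_n)\cdot\mathrm{sgn}^{*}(y_m)]
&= \mathbb{E}[\mathrm{sgn}(y_n^{R})\cdot\mathrm{sgn}(y_m^{R})]+\mathbb{E}[\mathrm{sgn}(y_n^{I})\cdot\mathrm{sgn}(y_m^{I})]\\
&\quad-i\cdot\mathbb{E}[\mathrm{sgn}(y_n^{R})\cdot\mathrm{sgn}(y_m^{I})]+i\cdot\mathbb{E}[\mathrm{sgn}(y_n^{I})\cdot\mathrm{sgn}(y_m^{R})]\\
&= \frac{2}{\pi}\arcsin(\rho_{y_n^{R},y_m^{R}})+\frac{2}{\pi}\arcsin(\rho_{y_n^{I},y_m^{I}})\\
&\quad-i\cdot\frac{2}{\pi}\arcsin(\rho_{y_n^{R},y_m^{I}})+i\cdot\frac{2}{\pi}\arcsin(\rho_{y_n^{I},y_m^{R}}),
\end{aligned} \tag{60}$$

where the last equation follows from Lemma 1. To proceed,
we need to evaluate some correlation coefficients, e.g., $\rho_{y_n^{R},y_m^{R}}$,
which is given as

$$\begin{aligned}
\rho_{y_n^{R},y_m^{R}}
&= \frac{\mathbb{E}[y_n^{R}y_m^{R}]}{\sqrt{\mathbb{E}[(y_n^{R})^2]}\sqrt{\mathbb{E}[(y_m^{R})^2]}}\\
&= \frac{\mathbb{E}[(h_n^{R}x^{R}-h_n^{I}x^{I}+z_n^{R})(h_m^{R}x^{R}-h_m^{I}x^{I}+z_m^{R})]}{\sqrt{\mathbb{E}[(h_n^{R}x^{R}-h_n^{I}x^{I}+z_n^{R})^2]}\sqrt{\mathbb{E}[(h_m^{R}x^{R}-h_m^{I}x^{I}+z_m^{R})^2]}}\\
&= \frac{(h_nh_m^{*})^{R}\mathcal{E}_s}{\sqrt{|h_n|^2\mathcal{E}_s+1}\sqrt{|h_m|^2\mathcal{E}_s+1}}.
\end{aligned} \tag{61}$$

Besides, following essentially the same line we have

$$\rho_{y_n^{I},y_m^{I}} = \rho_{y_n^{R},y_m^{R}},\qquad
\rho_{y_n^{I},y_m^{R}} = -\rho_{y_n^{R},y_m^{I}} = \frac{(h_nh_m^{*})^{I}\mathcal{E}_s}{\sqrt{|h_n|^2\mathcal{E}_s+1}\sqrt{|h_m|^2\mathcal{E}_s+1}}. \tag{62}$$

Now we can combine (60)-(62) to get $\mathbb{E}[\mathrm{sgn}(y_n)\cdot\mathrm{sgn}^{*}(y_m)]$
as follows

$$\mathbb{E}[\mathrm{sgn}(y_n)\cdot\mathrm{sgn}^{*}(y_m)] = \frac{4}{\pi}\left[\arcsin\left(\frac{(h_nh_m^{*})^{R}\mathcal{E}_s}{\sqrt{|h_n|^2\mathcal{E}_s+1}\sqrt{|h_m|^2\mathcal{E}_s+1}}\right)+i\cdot\arcsin\left(\frac{(h_nh_m^{*})^{I}\mathcal{E}_s}{\sqrt{|h_n|^2\mathcal{E}_s+1}\sqrt{|h_m|^2\mathcal{E}_s+1}}\right)\right]. \tag{63}$$

Further, from (58), (59) and (63), we obtain $(\mathbf{R}_{rr})_{n,m}$, given
as

$$\begin{aligned}
(\mathbf{R}_{rr})_{n,m}
&= \delta_n\delta_m\cdot\mathbb{E}[y_ny_m^{*}]+\delta_n\bar{\delta}_m\cdot\mathbb{E}[y_n\cdot\mathrm{sgn}^{*}(y_m)]\\
&\quad+\bar{\delta}_n\delta_m\cdot\mathbb{E}[\mathrm{sgn}(y_n)\cdot y_m^{*}]+\bar{\delta}_n\bar{\delta}_m\cdot\mathbb{E}[\mathrm{sgn}(y_n)\cdot\mathrm{sgn}^{*}(y_m)]\\
&= h_nh_m^{*}\mathcal{E}_s\left[\delta_n\delta_m+\delta_n\bar{\delta}_m\sqrt{\frac{4}{\pi(|h_m|^2\mathcal{E}_s+1)}}+\bar{\delta}_n\delta_m\sqrt{\frac{4}{\pi(|h_n|^2\mathcal{E}_s+1)}}\right]\\
&\quad+\bar{\delta}_n\bar{\delta}_m\cdot\frac{4}{\pi}\left[\arcsin\left(\frac{(h_nh_m^{*})^{R}\mathcal{E}_s}{\sqrt{|h_n|^2\mathcal{E}_s+1}\sqrt{|h_m|^2\mathcal{E}_s+1}}\right)+i\cdot\arcsin\left(\frac{(h_nh_m^{*})^{I}\mathcal{E}_s}{\sqrt{|h_n|^2\mathcal{E}_s+1}\sqrt{|h_m|^2\mathcal{E}_s+1}}\right)\right].
\end{aligned} \tag{64}$$

Thus we conclude the proof.


B. Asymptotic behavior of $I_{\mathrm{GMI}}(\mathbf{w}_{\mathrm{opt}},\boldsymbol{\delta})$ in low SNR regime

For simplicity of exposition, we define

$$\mathbf{R}^{0}_{rx} = \lim_{\mathcal{E}_s\to 0}\frac{1}{\mathcal{E}_s}\mathbf{R}_{rx},\qquad \mathbf{R}^{0}_{rr} = \lim_{\mathcal{E}_s\to 0}\mathbf{R}_{rr}. \tag{65}$$

Then from (13) and (14), it is straightforward to verify that

$$(\mathbf{R}^{0}_{rx})_n = \left(\delta_n+\bar{\delta}_n\cdot\sqrt{\frac{4}{\pi}}\right)h_n,\qquad
\mathbf{R}^{0}_{rr} = \mathrm{diag}(1+\bar{\delta}_1,\ldots,1+\bar{\delta}_n,\ldots,1+\bar{\delta}_N). \tag{66}$$

Thereby we examine the asymptotic behavior of $\kappa(\mathbf{w}_{\mathrm{opt}},\boldsymbol{\delta})$
as $\mathcal{E}_s\to 0$; that is

$$\begin{aligned}
\lim_{\mathcal{E}_s\to 0}\frac{\kappa(\mathbf{w}_{\mathrm{opt}},\boldsymbol{\delta})}{\mathcal{E}_s}
&\overset{(a)}{=} \lim_{\mathcal{E}_s\to 0}\left(\frac{1}{\mathcal{E}_s}\mathbf{R}_{rx}\right)^{\dagger}\mathbf{R}_{rr}^{-1}\left(\frac{1}{\mathcal{E}_s}\mathbf{R}_{rx}\right)\\
&\overset{(b)}{=} \left(\lim_{\mathcal{E}_s\to 0}\frac{1}{\mathcal{E}_s}\mathbf{R}_{rx}\right)^{\dagger}\left(\lim_{\mathcal{E}_s\to 0}\mathbf{R}_{rr}^{-1}\right)\left(\lim_{\mathcal{E}_s\to 0}\frac{1}{\mathcal{E}_s}\mathbf{R}_{rx}\right)\\
&\overset{(c)}{=} (\mathbf{R}^{0}_{rx})^{\dagger}(\mathbf{R}^{0}_{rr})^{-1}(\mathbf{R}^{0}_{rx})\\
&= \sum_{n=1}^{N}\frac{\left(\delta_n+\bar{\delta}_n\sqrt{4/\pi}\right)^2|h_n|^2}{1+\bar{\delta}_n}\\
&= \sum_{n=1}^{N}\left(\delta_n+\bar{\delta}_n\cdot\frac{2}{\pi}\right)|h_n|^2,
\end{aligned} \tag{67}$$

where (a) follows from (17), (b) is obtained by applying the
algebraic limit theorem since the limits of $\mathbf{R}_{rx}/\mathcal{E}_s$ and $\mathbf{R}_{rr}^{-1}$
exist, while (c) comes from the fact that the inverse of a
nonsingular matrix is a continuous function of the elements
of the matrix, i.e., $\lim_{\mathcal{E}_s\to 0}\mathbf{R}_{rr}^{-1} = (\lim_{\mathcal{E}_s\to 0}\mathbf{R}_{rr})^{-1}$ [29].
As a result, when $\mathcal{E}_s\to 0$ we have

$$\kappa(\mathbf{w}_{\mathrm{opt}},\boldsymbol{\delta}) = \sum_{n=1}^{N}\left(\delta_n+\bar{\delta}_n\cdot\frac{2}{\pi}\right)|h_n|^2\mathcal{E}_s+o(\mathcal{E}_s). \tag{68}$$

Noting that $\log(1+x/(1-x)) = x+o(x)$, as $x\to 0$, we
immediately have (22).


C. Asymptotic behavior of $I_{\mathrm{GMI}}(\mathbf{w}_{\mathrm{opt}},\boldsymbol{\delta})$ in high SNR regime

For simplicity of exposition, we rearrange $\mathbf{h}$ and stack the
channel coefficients corresponding to the antennas equipped
with high-resolution ADCs in the first $K$ positions of $\mathbf{h}$. To
proceed, we further define

$$\mathbf{p} = [h_1,\ldots,h_K]^{T},\qquad
\mathbf{q} = \left[\frac{h_{K+1}}{|h_{K+1}|},\ldots,\frac{h_N}{|h_N|}\right]^{T}. \tag{69}$$

When $\mathcal{E}_s$ tends to infinity, we have

$$h_n\mathcal{E}_s\sqrt{\frac{4}{\pi(|h_n|^2\mathcal{E}_s+1)}} = \left(\sqrt{4\mathcal{E}_s/\pi}+O(1/\sqrt{\mathcal{E}_s})\right)\frac{h_n}{|h_n|}, \tag{70}$$

for $n = K+1,\ldots,N$. As a result, we are allowed to denote
the deduced $\mathbf{R}_{rx}$ as

$$\mathbf{R}_{rx} = \begin{bmatrix}\mathcal{E}_s\mathbf{p}\\ \left(\sqrt{4\mathcal{E}_s/\pi}+O(1/\sqrt{\mathcal{E}_s})\right)\mathbf{q}\end{bmatrix}. \tag{71}$$

Besides, we write $\mathbf{R}_{rr}$ and its inverse $\mathbf{R}_{rr}^{-1}$ as partitioned
matrices, i.e.,

$$\mathbf{R}_{rr} = \begin{bmatrix}\mathbf{A}&\mathbf{U}\\ \mathbf{U}^{\dagger}&\mathbf{B}\end{bmatrix},\qquad
\mathbf{R}_{rr}^{-1} = \begin{bmatrix}\mathbf{C}&\mathbf{V}\\ \mathbf{V}^{\dagger}&\mathbf{D}\end{bmatrix}, \tag{72}$$

in which the invertible square matrices $\mathbf{A}\in\mathbb{C}^{K\times K}$, $\mathbf{B}\in\mathbb{C}^{(N-K)\times(N-K)}$,
and the rectangular matrix $\mathbf{U}\in\mathbb{C}^{K\times(N-K)}$
are taken to be

$$\begin{aligned}
\mathbf{A} &= \mathbf{I}+\mathcal{E}_s\mathbf{p}\mathbf{p}^{\dagger},\\
\mathbf{U} &= \left(\sqrt{4\mathcal{E}_s/\pi}+O(1/\sqrt{\mathcal{E}_s})\right)\mathbf{p}\mathbf{q}^{\dagger},\\
(\mathbf{B})_{n,m} &= \begin{cases}
2, & \text{if } n = m,\\[4pt]
\dfrac{4}{\pi}\left[\arcsin\left(\dfrac{(h_{n+K}h_{m+K}^{*})^{R}}{|h_{n+K}h_{m+K}|}\right)+i\cdot\arcsin\left(\dfrac{(h_{n+K}h_{m+K}^{*})^{I}}{|h_{n+K}h_{m+K}|}\right)\right]+O(1/\mathcal{E}_s), & \text{if } n\neq m.
\end{cases}
\end{aligned} \tag{73}$$

Then, applying the Sherman-Morrison formula [30] and the
inverse of partitioned matrix [31], we obtain

$$\begin{aligned}
\mathbf{C} &= (\mathbf{A}-\mathbf{U}\mathbf{B}^{-1}\mathbf{U}^{\dagger})^{-1}
= \mathbf{I}-\frac{\mathcal{E}_s\left[\pi-(4+O(1/\mathcal{E}_s))\mathbf{q}^{\dagger}\mathbf{B}^{-1}\mathbf{q}\right]\cdot\mathbf{p}\mathbf{p}^{\dagger}}{\pi+\mathcal{E}_s\left[\pi-(4+O(1/\mathcal{E}_s))\mathbf{q}^{\dagger}\mathbf{B}^{-1}\mathbf{q}\right]\cdot\|\mathbf{p}\|^2},\\
\mathbf{V} &= -\mathbf{A}^{-1}\mathbf{U}(\mathbf{B}-\mathbf{U}^{\dagger}\mathbf{A}^{-1}\mathbf{U})^{-1}
= -\frac{\left[\sqrt{4\pi\mathcal{E}_s}+O(1/\sqrt{\mathcal{E}_s})\right]\cdot\mathbf{p}\mathbf{q}^{\dagger}\mathbf{B}^{-1}}{\pi+\mathcal{E}_s\left[\pi-(4+O(1/\mathcal{E}_s))\mathbf{q}^{\dagger}\mathbf{B}^{-1}\mathbf{q}\right]\cdot\|\mathbf{p}\|^2},\\
\mathbf{D} &= (\mathbf{B}-\mathbf{U}^{\dagger}\mathbf{A}^{-1}\mathbf{U})^{-1}
= \mathbf{B}^{-1}+\frac{\left[4\mathcal{E}_s+O(1)\right]\|\mathbf{p}\|^2\cdot\mathbf{B}^{-1}\mathbf{q}\mathbf{q}^{\dagger}\mathbf{B}^{-1}}{\pi+\mathcal{E}_s\left[\pi-(4+O(1/\mathcal{E}_s))\mathbf{q}^{\dagger}\mathbf{B}^{-1}\mathbf{q}\right]\cdot\|\mathbf{p}\|^2}.
\end{aligned} \tag{74}$$

With all of these, we are ready to simplify $\kappa(\mathbf{w}_{\mathrm{opt}},\boldsymbol{\delta})$; that is,

$$\begin{aligned}
\kappa(\mathbf{w}_{\mathrm{opt}},\boldsymbol{\delta})
&= \frac{1}{\mathcal{E}_s}\mathbf{R}_{rx}^{\dagger}\mathbf{R}_{rr}^{-1}\mathbf{R}_{rx}\\
&= \mathcal{E}_s\mathbf{p}^{\dagger}\mathbf{C}\mathbf{p}+2\left(\sqrt{4\mathcal{E}_s/\pi}+O(1/\sqrt{\mathcal{E}_s})\right)\mathbf{p}^{\dagger}\mathbf{V}\mathbf{q}+\left(4/\pi+O(1/\mathcal{E}_s)\right)\mathbf{q}^{\dagger}\mathbf{D}\mathbf{q}\\
&= \frac{\mathcal{E}_s\left[\pi-(4+O(1/\mathcal{E}_s))\mathbf{q}^{\dagger}\mathbf{B}^{-1}\mathbf{q}\right]\cdot\|\mathbf{p}\|^2}{\pi+\mathcal{E}_s\left[\pi-(4+O(1/\mathcal{E}_s))\mathbf{q}^{\dagger}\mathbf{B}^{-1}\mathbf{q}\right]\cdot\|\mathbf{p}\|^2}
+\frac{\left[4+O(1/\mathcal{E}_s)\right]\mathbf{q}^{\dagger}\mathbf{B}^{-1}\mathbf{q}}{\pi+\mathcal{E}_s\left[\pi-(4+O(1/\mathcal{E}_s))\mathbf{q}^{\dagger}\mathbf{B}^{-1}\mathbf{q}\right]\cdot\|\mathbf{p}\|^2}.
\end{aligned} \tag{75}$$

Finally, we get the effective SNR as

$$\frac{\kappa(\mathbf{w}_{\mathrm{opt}},\boldsymbol{\delta})}{1-\kappa(\mathbf{w}_{\mathrm{opt}},\boldsymbol{\delta})} = \|\mathbf{p}\|^2\mathcal{E}_s+\frac{\left[4+O(1/\mathcal{E}_s)\right]\mathbf{q}^{\dagger}\mathbf{B}^{-1}\mathbf{q}}{\pi-\left[4+O(1/\mathcal{E}_s)\right]\mathbf{q}^{\dagger}\mathbf{B}^{-1}\mathbf{q}}. \tag{76}$$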

References 

[1] T. L. Marzetta, “Noncooperative cellular wireless with unlimited numbers 
of base station antennas,” IEEE Trans. Wireless Commun., vol. 9, no. 11, 
pp. 3590-3600, 2010. 

[2] F. Rusek, D. Persson, B. K. Lau, E. G. Larsson, T. L. Marzetta, O. Edfors,
and F. Tufvesson, “Scaling up MIMO: Opportunities and challenges with
very large arrays,” IEEE Signal Process. Mag., vol. 30, no. 1, pp. 40-60,
2013.

[3] H. Q. Ngo, E. G. Larsson, and T. L. Marzetta, “Energy and spectral effi¬ 
ciency of very large multiuser MIMO systems,” IEEE Trans. Commun., 
vol. 61, no. 4, pp. 1436-1449, 2013. 

[4] F. Boccardi, R. W. Heath, A. Lozano, T. L. Marzetta, and P. Popovski, 
“Five disruptive technology directions for 5G,” IEEE Commun. Mag., vol. 
52, no. 2, pp. 74-80, 2014. 

[5] E. Björnson, J. Hoydis, M. Kountouris, and M. Debbah, “Massive MIMO
systems with non-ideal hardware: Energy efficiency, estimation, and
capacity limits,” IEEE Trans. Inf. Theory, vol. 60, no. 11, pp. 7112-7139,
2014.

[6] E. Björnson, M. Matthaiou, and M. Debbah, “Massive MIMO with non-ideal
arbitrary arrays: Hardware scaling laws and circuit-aware design,”
IEEE Trans. Wireless Commun., vol. 14, no. 8, pp. 4353-4368, 2015.

[7] U. Gustavsson, C. Sanchez-Perez, T. Eriksson, F. Athley, G. Durisi, P.
Landin, K. Hausmair, C. Fager, and L. Svensson, “On the impact of
hardware impairments on massive MIMO,” IEEE GLOBECOM Workshop,
2014.

[8] R. Walden, “Analog-to-digital converter survey and analysis,” IEEE J.
Sel. Areas Commun., vol. 17, no. 4, pp. 539-550, 1999.






























[9] B. Le, T. Rondeau, J. Reed, and C. Bostian, “Analog-to-digital convert¬ 
ers,” IEEE Signal Process. Mag., vol. 22, no. 6, pp. 69-77, 2005. 

[10] J. Singh, O. Dabeer, and U. Madhow, “On the limits of communication
with low-precision analog-to-digital conversion at the receiver,” IEEE
Trans. Commun., vol. 57, no. 12, pp. 3629-3639, 2009.

[11] A. Mezghani, M. S. Khoufi, and J. A. Nossek, “A modified MMSE 
receiver for quantized MIMO systems,” in Proc. IEEE Workshop on Smart 
Antennas (WSA), 2007. 

[12] H. Yin, Z. Wang, L. Ke, and J. Wang, “Monobit digital receivers: Design,
performance, and application to impulse radio,” IEEE Trans. Commun.,
vol. 58, no. 6, pp. 1695-1704, 2010.

[13] C. Risi, D. Persson, and E. G. Larsson, “Massive MIMO with 1-bit 
ADC,” arXiv:1404.7736, 2014. 

[14] J. Mo and R. Heath, “High SNR capacity of millimeter wave MIMO 
systems with one-bit quantization,” in Proc. of Information Theory and 
Applications (ITA) Workshop, 2014. 

[15] M. T. Ivrlac and J. A. Nossek, “On MIMO channel estimation with 
single-bit signal-quantization,” ITG Smart Antenna Workshop, 2007. 

[16] A. Ganti, A. Lapidoth, and I. E. Telatar, “Mismatched decoding revisited:
General alphabets, channels with memory, and the wide-band limit,” IEEE
Trans. Inf. Theory, vol. 46, no. 7, pp. 2315-2328, 2000.

[17] A. Lapidoth and S. Shamai, “Fading channels: How perfect need ’perfect 
side information’ be?” IEEE Trans. Inf. Theory, vol. 48, no. 5, pp. 1118- 
1134, 2002. 

[18] W. Zhang, “A general framework for transmission with transceiver
distortion and some applications,” IEEE Trans. Commun., vol. 60, no.
2, pp. 384-399, 2012.

[19] M. Vehkaperä, T. Riihonen, M. Girnyk, E. Björnson, M. Debbah, L.
K. Rasmussen, and R. Wichman, “Asymptotic analysis of SU-MIMO
channels with transmitter noise and mismatched joint decoding,” IEEE
Trans. Commun., vol. 63, no. 3, pp. 749-765, 2015.

[20] A. Guillen i Fabregas, A. Martinez, and G. Caire, “Bit-interleaved coded 
modulation,” Found. Trends Commun. Inf. Theory, vol. 5, no. 1/2, pp. 1- 
153, 2008. 

[21] S. Jacobsson, G. Durisi, M. Coldrey, U. Gustavsson, and C. Studer, 
“One-bit massive MIMO: Channel estimation and high-order modula¬ 
tions,” arXiv: 1504.04540, 2015. 

[22] K. Knudson, R. Saab, and R. Ward, “One-bit compressive sensing with 
norm estimation,” arXiv:1404.6853, 2014. 

[23] O. Dabeer and A. Karnik, “Signal parameter estimation using 1-bit 
dithered quantization,” IEEE Trans. Inf. Theory, vol. 52, no. 12, 
pp. 5389-5405, 2006. 

[24] Y. Li, B. Bakkaloglu, and C. Chakrabarti, “A system level energy model 
and energy-quality evaluation for integrated transceiver front-ends,” IEEE 
Trans. Very Large Scale Integr. (VLSI) Syst., vol. 15, no. 1, pp. 99-103, 
2007. 

[25] Feasibility study for Further Advancements for E-UTRA 
(LTE-Advanced), 3GPP TR 36.912-v12.0.0, 2014. 

[26] J. Borremans, B. van Liempd, E. Martens, S. Cha, and J. Craninckx, 
“A 0.9V low-power 0.4-6GHz linear SDR receiver in 28nm CMOS,” in 
Symp. on VLSI Circuits, 2013. 

[27] Q. Bai and J. A. Nossek, “Energy efficiency maximization for 5G 
multi-antenna receivers,” Trans. Emerging Telecommun. Technol., vol. 26, 
no. 1, pp. 3-14, 2015. 

[28] T. Koch and A. Lapidoth, “Increased capacity per unit-cost by 
oversampling,” arXiv:1008.5393, 2010. 

[29] G. W. Stewart, “On the continuity of the generalized inverse,” SIAM J. 
Appl. Math., vol. 17, no. 1, pp. 33-45, 1969. 

[30] R. Horn and C. R. Johnson, Matrix Analysis, Cambridge University 
Press, 2012. 

[31] H. Hotelling, “Some new methods in matrix calculation,” Ann. Math. 
Statist., vol. 14, no. 1, pp. 1-34, 1943. 


Ning Liang received his B.E. degree in Communication Engineering from 
University of Science and Technology of China (USTC) in 2012. He is now a 
Ph.D. student in Wireless Communications at USTC, Hefei, China. His 
research interests include network interference analysis and 
low-complexity receiver design for massive MIMO systems. 


Wenyi Zhang (S'00, M'07, SM'11) is with the faculty of the Department of 
Electronic Engineering and Information Science, University of Science and 
Technology of China. Prior to that, he was affiliated with the 
Communication Science Institute, University of Southern California, as a 
postdoctoral research associate, and with Qualcomm Incorporated, 
Corporate Research and Development. He studied at Tsinghua University and 
obtained his Bachelor’s degree in Automation in 2001; he then studied at 
the University of Notre Dame, Indiana, USA, and obtained his Master’s and 
Ph.D. degrees, both in Electrical Engineering, in 2003 and 2006, 
respectively. His research interests include wireless communications and 
networking, information theory, and statistical signal processing. 



